{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "celltoolbar": "Slideshow",
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.3"
    },
    "colab": {
      "name": "mini-teach-LTP.ipynb",
      "provenance": [],
      "include_colab_link": true
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/RahmanPeimankar/mini-teach-LTP/blob/master/mini-teach-LTP.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BCgEYj_TxXEr"
      },
      "source": [
        "<center>Applied Machine Learning</center>\n",
        "\n",
        "***\n",
        "\n",
        "<center>Mini Teach LTP 2021</center>\n",
        "\n",
        "***\n",
        "\n",
        "<center>Model Evaluation<br></center>\n",
        "\n",
        "***\n",
        "\n",
        "<center>13 January 2021<center>\n",
        "<center>Rahman Peimankar<center>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-oynnkaSxXFC"
      },
      "source": [
        "# Recap Last Week"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DbaRdLpSxXFD"
      },
      "source": [
        "Last week we learned about supervised machine learning models such as: \n",
        "\n",
        "* Decision Tree,\n",
        "* Support Vector Machines,\n",
        "* K-Nearest Neigbor\n",
        "* Random Forest, and\n",
        "* Logist Regression\n",
        "\n",
        "We also learnt how to implement them in Python using different packages!<br>\n",
        "\n",
        "<font color='red'>But <font color='black'>, let's talk about some measures to check how good each model is compared to others on a given dataset! "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DTX6DYVIxXFD"
      },
      "source": [
        "# Recap Quiz 1"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "biCHO_OPxXFE"
      },
      "source": [
        "Classification is ...\n",
        "\n",
        "A) Unsupervised learning<br>\n",
        "B) Reinforcement learning<br>\n",
        "C) Supervised learning<br>\n",
        "D) None<br>\n",
        "\n",
        "Please answer using the link below:<br>\n",
        "https://PollEv.com/multiple_choice_polls/K0UrirMtx8y7FnPxDzYVS/respond"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "buni-23mxXFE"
      },
      "source": [
        "# Recap Quiz 2"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xk3dCgbaxXFF"
      },
      "source": [
        "Classification is appropriate when you ...\n",
        "\n",
        "A) Try to predict a continuous valued output<br>\n",
        "B) Try to predict a class or discrete output<br>\n",
        "C) Both A and B for different contexts<br> \n",
        "D) None<br>\n",
        "\n",
        "Please answer using the link below:<br>\n",
        "https://PollEv.com/multiple_choice_polls/kJMXsQvjoxgL8mVT7wG5X/respond"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HzyoUolhxXFF"
      },
      "source": [
        "# Binary Classification Metrics"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "75x5KeqtxXFG"
      },
      "source": [
        "Imagine that we have a dataset to predict **Heart Disease** using different features (variables).\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-1.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YhpWd8L8xXFH"
      },
      "source": [
        "To achieve this we could use:\n",
        "<div>\n",
        "<center>\n",
        "<table><tr>\n",
        "<td>\n",
        "    \n",
        "**Logistic Regression**\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-2.jpg?raw=1\" width=\"500\"/>\n",
        "    \n",
        "</td>\n",
        "<td>\n",
        "    \n",
        "**K-Nearest Neigbors**\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-3.jpg?raw=1\" width=\"500\"/>\n",
        "    \n",
        "</td>\n",
        "    \n",
        "<td>\n",
        "    \n",
        "**Random Forest**\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-4.jpg?raw=1\" width=\"700\"/>\n",
        "    \n",
        "</td>\n",
        "</tr></table>\n",
        "    \n",
        "Or any other methods available!\n",
        "   "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ffWTKk2KxXFI"
      },
      "source": [
        "* But how do we decide which one works best?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lwSRViphxXFJ"
      },
      "source": [
        "# Confusion Matrix"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Habdg5HnxXFK"
      },
      "source": [
        "* We divide the dataset into **Training Data** and **Testing Data**.\n",
        "* Then we **train all of the methods** that we have with the **training data**.\n",
        "* And then **test each method** on the **testing data**.\n",
        "\n",
        "Now we need to summarize how each method performs on the testing data.<br>\n",
        "<font color='green'>One way to do this is by creating the **Confsion Matrix** for each method.<br>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "esDAj2uvxXFK"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<table><tr>\n",
        "<td>    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-5.jpg?raw=1\" width=\"700\"/>   \n",
        "</td>\n",
        "<td>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-6.jpg?raw=1\" width=\"700\"/> \n",
        "</td>   \n",
        "<td>   \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-7.jpg?raw=1\" width=\"700\"/> \n",
        "</td>\n",
        "</tr></table>  "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hSNTZ9pPxXFL"
      },
      "source": [
        "* The rows in a Confusion Matrix corresponds to what the machine learning algorithm predicted.\n",
        "* The columns correspond to the known truth."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SX6J7Y2dxXFL"
      },
      "source": [
        "# True Positives"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "P0OaAH9ExXFL"
      },
      "source": [
        "* The top left corner contains **True Positives**.\n",
        "* These are the patients that *had heart disease* that were correctly identified by the algorithm. \n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-8.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uFWxXT22xXFM"
      },
      "source": [
        "# True Negatives"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8ANUqPyYxXFM"
      },
      "source": [
        "* The **True Negatives** are in the bottom right-hand corner.\n",
        "* These are the patients that *did not have heat disease* that were correctly identified by the algorithm.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-9.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "M5pqJv_NxXFM"
      },
      "source": [
        "# False Negatives"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CZj99p7VxXFN"
      },
      "source": [
        "* The bottom left-hand corner contains **False Negatives**.\n",
        "* **False Negatives** are when a patient has a heart disease, but the algorithm said they didn't.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-10.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UtnzUChlxXFN"
      },
      "source": [
        "# False Positives"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7cG4hIW1xXFN"
      },
      "source": [
        "* The top right-hand corner contains the **False Positives**.\n",
        "* **False Positives** are patients that do not have heart disease, but the algorithm says they.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-11.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
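    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a minimal sketch, the four cells described above could be counted by hand as follows. The labels `y_true` and `y_pred` are made-up toy values (1 = heart disease, 0 = no heart disease), not the data behind the slides.\n",
        "\n",
        "```python\n",
        "# Toy labels (assumed for illustration): 1 = heart disease, 0 = no heart disease.\n",
        "y_true = [1, 1, 1, 0, 0, 0, 1, 0]\n",
        "y_pred = [1, 0, 1, 0, 1, 0, 1, 0]\n",
        "\n",
        "pairs = list(zip(y_true, y_pred))\n",
        "tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # has disease, predicted disease\n",
        "tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # no disease, predicted no disease\n",
        "fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # has disease, predicted no disease\n",
        "fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # no disease, predicted disease\n",
        "\n",
        "# Layout used on the slides: rows = predicted, columns = known truth.\n",
        "matrix = [[tp, fp], [fn, tn]]\n",
        "print(matrix)\n",
        "```\n",
        "\n",
        "The printed matrix follows the layout of the slides: the top row holds the samples predicted as positive, the bottom row those predicted as negative."
      ]
    },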
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "z-NADF04xXFO"
      },
      "source": [
        "# Quick Quiz 1"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jw7IpXwFxXFO"
      },
      "source": [
        "Suppose your classification model predicted true for a class which actual value was false. Then this is a ...\n",
        "\n",
        "A) False positive<br> \n",
        "B) False negative<br>\n",
        "C) True positive<br>\n",
        "D) True negative<br>\n",
        "\n",
        "Please answer using the link below:<br>\n",
        "\n",
        "https://PollEv.com/multiple_choice_polls/MCSLjYG0mRVuaGB1HIB0z/respond"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2quAlEpFxXFO"
      },
      "source": [
        "# How to Interpret the Confusion Matrix"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qK5nstCGxXFP"
      },
      "source": [
        "* The numbers along the diagonal (the <font color='green'>Green Boxes)<font color='black'> tell us how many times the samples were correctly classified.\n",
        "* The numbers *not* on the diagonal (the <font color='red'>Red Boxes)<font color='black'> are samples the algorithm messed up.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-12.jpg?raw=1\" width=\"550\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "us5WF6mKxXFP"
      },
      "source": [
        "Now we can compare the **Random Forest's Confusion Matrix** to the **Confusion Matrix** we get when we use **K-Nearest Neighbors**."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VNy1o_CjxXFQ"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-13.jpg?raw=1\" width=\"950\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7YGl5G9gxXFQ"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-14.jpg?raw=1\" width=\"950\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fs-YeHTzxXFR"
      },
      "source": [
        "* **K-Nearest Neigbors** was worse than the **Random Forest** at predicting patients *with* Heart Disease (**107** vs **142**).\n",
        "* And worse at predicting patients *without* Heart Disease (**79** vs **110**)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "d2Ma8_NjxXFR"
      },
      "source": [
        "# When We Need More Than Confusion Matrix!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Y7TDem0cxXFS"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-15.jpg?raw=1\" width=\"950\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QfOz_BkuxXFS"
      },
      "source": [
        "* These two **Confusion Matrices** are very similar and make it hard to choose which machine learning method is a better fit for this data.\n",
        "* We will talk about more sophisticated metrices, such as **Sensitivity** and **Specificity**, that can help us make a decision in such cases!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KrtD9yz-xXFT"
      },
      "source": [
        "# Sensitivity & Specificity"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6AmxWf65xXFT"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-16.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IELVoAv1xXFT"
      },
      "source": [
        "# Sensitivity (Se)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "09bg7Mv2xXFT"
      },
      "source": [
        "**Sensitivity** tells us what percentage of patients **_with_** heart disease were correctly identified.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-17.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WDyxrIrlxXFU"
      },
      "source": [
        "**Sensitivity**$ = \\frac{True\\,Positives}{True\\,Positives + False\\,Negatives}$"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JTvUBNLsxXFW"
      },
      "source": [
        "# Specificity (Sp)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FWfTF4CqxXFW"
      },
      "source": [
        "**Specificity** tells us what percentage of patients **_without_** heart disease were correctly identified.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-18.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tMxLWzMOxXFX"
      },
      "source": [
        "**Specificity**$ = \\frac{True\\,Negatives}{True\\,Negatives + False\\,Positives}$"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Na3gfkvvxXFX"
      },
      "source": [
        "# Let's Calculate Se & Sp"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RC7LchQGxXFX"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-15.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eURfleASxXFX"
      },
      "source": [
        "# Logistic Regression Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aR3LWJbFxXFY"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-19.jpg?raw=1\" width=\"650\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Mm4qi-DlxXFY"
      },
      "source": [
        "**Sensitivity**$ = \\frac{True\\,Positives}{True\\,Positives + False\\,Negatives} = \\frac{139}{139 + 32} = 0.81$\n",
        "\n",
        "**Specificity**$ = \\frac{True\\,Negatives}{True\\,Negatives + False\\,Positives} = \\frac{112}{112 + 20} = 0.85$\n",
        "\n",
        "* **Sensitivity** tells us that **81%** of the people **_with_** Heart Disease were correctly identified by the model.\n",
        "* **Specificity** tells us that **85%** of the people **_without_** Heart Disease were correctly identified by the model."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bgah1n68xXFZ"
      },
      "source": [
        "# Random Forest Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pTkW7ScIxXFZ"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-20.jpg?raw=1\" width=\"650\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MYIqVZduxXFZ"
      },
      "source": [
        "**Sensitivity**$ = \\frac{True\\,Positives}{True\\,Positives + False\\,Negatives} = \\frac{142}{142 + 29} = 0.83$\n",
        "\n",
        "**Specificity**$ = \\frac{True\\,Negatives}{True\\,Negatives + False\\,Positives} = \\frac{110}{110 + 22} = 0.83$"
      ]
    },
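    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The calculations above can be reproduced with a short sketch. The helper names `sensitivity` and `specificity` are hypothetical and chosen only for illustration; the counts are the ones read off the two confusion matrices above.\n",
        "\n",
        "```python\n",
        "def sensitivity(tp, fn):\n",
        "    # Share of the actual positives that the model identified.\n",
        "    return tp / (tp + fn)\n",
        "\n",
        "def specificity(tn, fp):\n",
        "    # Share of the actual negatives that the model identified.\n",
        "    return tn / (tn + fp)\n",
        "\n",
        "# Counts taken from the Logistic Regression and Random Forest confusion matrices.\n",
        "models = {\n",
        "    'Logistic Regression': {'tp': 139, 'fn': 32, 'tn': 112, 'fp': 20},\n",
        "    'Random Forest': {'tp': 142, 'fn': 29, 'tn': 110, 'fp': 22},\n",
        "}\n",
        "\n",
        "results = {}\n",
        "for name, c in models.items():\n",
        "    se = round(sensitivity(c['tp'], c['fn']), 2)\n",
        "    sp = round(specificity(c['tn'], c['fp']), 2)\n",
        "    results[name] = (se, sp)\n",
        "    print(name, 'Se =', se, 'Sp =', sp)\n",
        "```\n",
        "\n",
        "Rounded to two decimals, this gives the same values as the hand calculations."
      ]
    },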
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "x9MFguAGxXFa"
      },
      "source": [
        "# Logistic Regression or Random Forest?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EcAhgqEPxXFa"
      },
      "source": [
        "<div>\n",
        "\n",
        "<table><tr>\n",
        "<td>\n",
        "    \n",
        "* **Sensitivity** = 0.83\n",
        "* **Specificity** = 0.83\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-21.jpg?raw=1\" width=\"700\"/>\n",
        "    \n",
        "</td>\n",
        "<td>\n",
        "    \n",
        "* **Sensitivity** = 0.81\n",
        "* **Specificity** = 0.85\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-22.jpg?raw=1\" width=\"500\"/>\n",
        "    \n",
        "</td>\n",
        "</tr></table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "v4oXNe7YxXFb"
      },
      "source": [
        "* **Sensitivity** tells us that the **Random Forest** is slightly better at correctly classifying *positives* cases.\n",
        "* **Specificity** tells us that the **Logistic Regression** is slightly better at correctly classifying *negatives* cases."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Jld2IBHDxXFb"
      },
      "source": [
        "# Implementation Examples in Python"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "EKEY2zE-xXFc"
      },
      "source": [
        "from sklearn.datasets import load_breast_cancer\n",
        "from sklearn.linear_model import LogisticRegression\n",
        "from sklearn.model_selection import train_test_split\n",
        "from sklearn.metrics import classification_report\n",
        "\n",
        "data = load_breast_cancer()\n",
        "X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, stratify=data.target, random_state=0)\n",
        "\n",
        "lr = LogisticRegression().fit(X_train, y_train)\n",
        "y_pred = lr.predict(X_test)\n",
        "\n",
        "from sklearn.metrics import confusion_matrix\n",
        "print(confusion_matrix(y_test, y_pred))\n",
        "print(lr.score(X_test, y_test))\n",
        "print('#'*60)\n",
        "print(classification_report(y_test, y_pred))"
      ],
      "execution_count": 2,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4NamyqbHzMJA"
      },
      "source": [
        "# **Exercise**: \r\n",
        "1. Use \"predict_proba\" function to generate probabilities for the prediction.\r\n",
        "2. Set a threshold of 0.85 for the prediction and compare the results.\r\n",
        "\r\n",
        "The students should commit their answers to GitHub!"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "dQLqSQ-hxXFd"
      },
      "source": [
        "y_pred_thresh = lr.predict_proba(X_test)[:, 1] > .85\n",
        "print(classification_report(y_test, y_pred_thresh))\n",
        "print('#'*60)\n",
        "print(classification_report(y_test, y_pred))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gRXuem0-xXFe"
      },
      "source": [
        "# Retrospective"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JWBfAIUgxXFe"
      },
      "source": [
        "Please summarize your learning on today's lecture!<br>\n",
        "\n",
        "https://PollEv.com/free_text_polls/2IugSYc8XgusxydQ2aIzP/respond"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LGA68a-oxXFe"
      },
      "source": [
        "<font size=\"8\"><center>Thank you!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kkRRupFqxXFe"
      },
      "source": [
        ""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SKaYi1JExXFf"
      },
      "source": [
        "# Receiver Operating Characteristic (ROC)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9hk496oUxXFf"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-26.jpg?raw=1\" width=\"650\"/>\n",
        "</div>\n",
        "\n",
        "* The <font color='blue'> blue dots <font color='black'> represent <font color='blue'> obese <font color='black'> mice ...\n",
        "* The <font color='red'> red dots <font color='black'> represent mice that are <font color='red'> not obese."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3NEEuqM4xXFf"
      },
      "source": [
        "* Now let's fit a Logistic Regression curve to the data\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-27.jpg?raw=1\" width=\"450\"/>\n",
        "</div>\n",
        "    \n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "u-DQLN7RxXFg"
      },
      "source": [
        "* The y-axis is converted to the probability that a mouse <font color='blue'> is obese.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-28.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4r8lDAUrxXFh"
      },
      "source": [
        "The curve would tell us that there is a **high** probability that the mouse <font color='blue'> is obese.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-29.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "v0g4MDySxXFh"
      },
      "source": [
        "For this mouse, there is a **low** probability that the mouse <font color='blue'> is obese.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-30.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "K8Qdj2GQxXFh"
      },
      "source": [
        "# Turn Probabilities into Classification"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zLJeTgs-xXFi"
      },
      "source": [
        "* One way to classify mice is to set a threshold (cutoff) at **0.5**.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-31.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pV6xVFXyxXFi"
      },
      "source": [
        "Let's evaluate the effectiveness of this Logistic Regression with the classification threshold set to **0.5**!\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-32.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-6bijm8ZxXFi"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-33.jpg?raw=1\" width=\"550\"/>\n",
        "</div>\n",
        "    \n",
        "* Now we can calculate **Se** and **Sp** to evaluate this Logistic Regression at **threshold = 0.5**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iGyW5ySkxXFj"
      },
      "source": [
        "# New Threshold"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_yzU-8SsxXFj"
      },
      "source": [
        "* Since we are at the middle ot an epidemi, let assume that we would like to predict infected vs not infected individuals instead of obesity!\n",
        "* Here, it is absolutely essential to correctly classify *every* sample infected in order to minimize the risk of an outbreak... \n",
        "* So, we could lower the threshold to **0.1**.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-34.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "py-sas5fxXFk"
      },
      "source": [
        "* Let's look at the Confusion Matrix for this threshold!\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-35.jpg?raw=1\" width=\"450\"/>\n",
        "</div>\n",
        "\n",
        "* On the other hand, lowering the threshold results in more **False Positives**. "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "naIikG5pxXFk"
      },
      "source": [
        "* What if we set the threshold higher e.g. **0.9**!\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-36.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iXF9KFdtxXFl"
      },
      "source": [
        "* With this data, the higher threshold does abetter job classifying samples.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-37.jpg?raw=1\" width=\"450\"/>\n",
        "</div>"
      ]
    },
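    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A minimal sketch of how one set of predicted probabilities leads to different confusion matrices at the thresholds 0.1, 0.5 and 0.9. The labels and probabilities are made-up toy values, not the data from the figures, and `counts_at` is a hypothetical helper name.\n",
        "\n",
        "```python\n",
        "# Toy data (assumed for illustration): 1 = positive class, 0 = negative class.\n",
        "y_true = [0, 0, 0, 0, 1, 1, 1, 1]\n",
        "# Hypothetical predicted probabilities of the positive class.\n",
        "proba = [0.05, 0.20, 0.40, 0.60, 0.30, 0.70, 0.85, 0.95]\n",
        "\n",
        "def counts_at(threshold):\n",
        "    # Classify as positive when the probability reaches the threshold.\n",
        "    y_pred = [1 if p >= threshold else 0 for p in proba]\n",
        "    pairs = list(zip(y_true, y_pred))\n",
        "    tp = sum(1 for t, p in pairs if t == 1 and p == 1)\n",
        "    fp = sum(1 for t, p in pairs if t == 0 and p == 1)\n",
        "    fn = sum(1 for t, p in pairs if t == 1 and p == 0)\n",
        "    tn = sum(1 for t, p in pairs if t == 0 and p == 0)\n",
        "    return tp, fp, fn, tn\n",
        "\n",
        "for threshold in (0.1, 0.5, 0.9):\n",
        "    tp, fp, fn, tn = counts_at(threshold)\n",
        "    print(f'threshold={threshold}: TP={tp} FP={fp} FN={fn} TN={tn}')\n",
        "```\n",
        "\n",
        "On this toy data the low threshold leaves no False Negatives but produces more False Positives, while the high threshold does the opposite. Which threshold works best depends on the data and on which kind of error is more costly."
      ]
    },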
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6OFbR_sOxXFl"
      },
      "source": [
        "# How do We Determine The Best Threshold?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2NWjIQ-NxXFl"
      },
      "source": [
        "* We do not need to test every single threshold.\n",
        "* Some thresholds result in the exact same confusion matrix.\n",
        "\n",
        "But even if we made one confusion matrix for each threshold that mattered, it would result in a confusingly large number of confusion matrix!! "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w2P409SaxXFm"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-38.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cSPMPLgGxXFm"
      },
      "source": [
        "# Receiver Operating Characteristic (ROC)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JzmTQoQYxXFm"
      },
      "source": [
        "* **Receiver Operating Characteristics (ROC)** provides a simple way to summarize all of the information.\n",
        "\n",
        "<div>\n",
        "\n",
        "<table><tr>\n",
        "<td>\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-39.jpg?raw=1\" width=\"350\"/>\n",
        "    \n",
        "</td>\n",
        "<td>\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-40.jpg?raw=1\" width=\"550\"/>\n",
        "    \n",
        "</td>\n",
        "</tr></table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w4sCkVwHxXFm"
      },
      "source": [
        "* The y-axis shows the **True Positive Rate**, which is the same thing as **Sensitivity**.\n",
        "<center>True Positive Rate = Sensitivity = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Negatives}$</center>\n",
        "\n",
        "* The x-axis shows the **False Positive Rate**, which is the same thing as **1 - Specificity**.\n",
        "<center>False Positive Rate = 1 - Specificity = $\\frac{False\\, Positives}{False\\, Positives\\, +\\, True\\, Negatives}$</center>"
      ]
    },
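    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough sketch, each threshold yields one (False Positive Rate, True Positive Rate) point, and the ROC curve connects these points. The toy labels, the probabilities and the helper name `roc_point` are assumptions made for illustration only.\n",
        "\n",
        "```python\n",
        "# Toy data (assumed for illustration): 1 = positive class, 0 = negative class.\n",
        "y_true = [0, 0, 0, 0, 1, 1, 1, 1]\n",
        "# Hypothetical predicted probabilities of the positive class.\n",
        "proba = [0.05, 0.20, 0.40, 0.60, 0.30, 0.70, 0.85, 0.95]\n",
        "\n",
        "def roc_point(threshold):\n",
        "    y_pred = [1 if p >= threshold else 0 for p in proba]\n",
        "    pairs = list(zip(y_true, y_pred))\n",
        "    tp = sum(1 for t, p in pairs if t == 1 and p == 1)\n",
        "    fp = sum(1 for t, p in pairs if t == 0 and p == 1)\n",
        "    fn = sum(1 for t, p in pairs if t == 1 and p == 0)\n",
        "    tn = sum(1 for t, p in pairs if t == 0 and p == 0)\n",
        "    tpr = tp / (tp + fn)  # Sensitivity\n",
        "    fpr = fp / (fp + tn)  # 1 - Specificity\n",
        "    return fpr, tpr\n",
        "\n",
        "# A threshold of 0.0 classifies every sample as positive; a threshold above 1.0 classifies none.\n",
        "thresholds = [0.0, 0.25, 0.5, 0.75, 1.01]\n",
        "roc_points = [roc_point(t) for t in thresholds]\n",
        "for t, (fpr, tpr) in zip(thresholds, roc_points):\n",
        "    print(f'threshold={t:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}')\n",
        "```\n",
        "\n",
        "Plotting the False Positive Rate on the x-axis against the True Positive Rate on the y-axis for these points traces the ROC curve from (1, 1) down to (0, 0)."
      ]
    },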
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TXF7mpP-xXFn"
      },
      "source": [
        "* To get a better sense of how the **ROC** works, let's draw one from start o finish using our example data.\n",
        "* We will start by using a threshold that classifies *all* of the samples as <font color='blue'> obese <font color='black'>...\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-41.jpg?raw=1\" width=\"750\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rbc2hnMjxXFn"
      },
      "source": [
        "**True Positive Rate** = Sensitivity = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Negatives} = \\frac{4}{4 + 0} = 1$\n",
        "* This means that every single <font color='blue'>obese <font color='black'>sample was *correctly* classified.\n",
        "\n",
        "**False Positive Rate** = 1 - Specificity = $\\frac{False\\, Positives}{False\\, Positives\\, +\\, True\\, Negatives} = \\frac{4}{4 + 0} = 1$\n",
        "* This means that every single sample that was <font color='red'>not obese <font color='black'>was *incorrectly* classified as <font color='blue'>obese<font color='black'>."
      ]
    },
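    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "* The two rates above can also be checked in code. The cell below is only a minimal sketch: the helper name `tpr_fpr` is our own (hypothetical, not a library function), and the counts are the ones from the confusion matrix for the threshold that classifies every sample as obese. It should reproduce the point (1, 1)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "def tpr_fpr(tp, fn, fp, tn):\n",
        "    # Return (True Positive Rate, False Positive Rate) from confusion-matrix counts\n",
        "    tpr = tp / (tp + fn)  # Sensitivity\n",
        "    fpr = fp / (fp + tn)  # 1 - Specificity\n",
        "    return tpr, fpr\n",
        "\n",
        "# Threshold that classifies all samples as obese: TP=4, FN=0, FP=4, TN=0\n",
        "print(tpr_fpr(tp=4, fn=0, fp=4, tn=0))"
      ],
      "execution_count": null,
      "outputs": []
    },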
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "msA6CfNWxXFn"
      },
      "source": [
        "* A point at (1,1) means that even though we correctly classified **all** of the <font color='blue'>obese <font color='black'>samples, we *incorrectly* classified **all** of the samples that were <font color='red'>not obese<font color='black'>.\n",
        "* This <font color='green'> green diagonal line <font color='black'>shows where the **True Positive Rate = False Positive Rate**.\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-42.jpg?raw=1\" width=\"650\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3yHfxm9IxXFo"
      },
      "source": [
        "* Let's increase the threshold.\n",
        "* Assume that we get the confusion matrix below.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-43.jpg?raw=1\" width=\"550\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "K1hVvaVHxXFo"
      },
      "source": [
        "* **True Positive Rate** = Sensitivity = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Negatives} = \\frac{4}{4 + 0} = 1$\n",
        "* **False Positive Rate** = 1 - Specificity = $\\frac{False\\, Positives}{False\\, Positives\\, +\\, True\\, Negatives} = \\frac{3}{3 + 1} = 0.75$\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-44.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BiHmV1UDxXFo"
      },
      "source": [
        "* Let's increase the threshold further!\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-45.jpg?raw=1\" width=\"600\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1kfpk8uBxXFp"
      },
      "source": [
        "* **True Positive Rate** = Sensitivity = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Negatives} = \\frac{4}{4 + 0} = 1$\n",
        "* **False Positive Rate** = 1 - Specificity = $\\frac{False\\, Positives}{False\\, Positives\\, +\\, True\\, Negatives} = \\frac{2}{2 + 2} = 0.5$\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-46.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eFUCELDUxXFp"
      },
      "source": [
        "* Let's increase the threshold again!\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-47.jpg?raw=1\" width=\"600\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9olcrBgYxXFq"
      },
      "source": [
        "* **True Positive Rate** = Sensitivity = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Negatives} = \\frac{3}{3 + 1} = 0.75$\n",
        "* **False Positive Rate** = 1 - Specificity = $\\frac{False\\, Positives}{False\\, Positives\\, +\\, True\\, Negatives} = \\frac{1}{1 + 3} = 0.25$\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-48.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SQGM-DnHxXFq"
      },
      "source": [
        "* Let's increase the threshold again!!\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-49.jpg?raw=1\" width=\"600\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VVoPfcPsxXFq"
      },
      "source": [
        "* **True Positive Rate** = Sensitivity = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Negatives} = \\frac{3}{3 + 1} = 0.75$\n",
        "* **False Positive Rate** = 1 - Specificity = $\\frac{False\\, Positives}{False\\, Positives\\, +\\, True\\, Negatives} = \\frac{0}{0 + 4} = 0$\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-50.jpg?raw=1\" width=\"400\"/>\n",
        "</div>\n",
        "\n",
        "* This threshold resulted in no **False Positives**."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MtsoRC9fxXFr"
      },
      "source": [
        "* Let's increase the threshold and plot the points! \n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-51.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "efV4TDs8xXFr"
      },
      "source": [
        "* If we increase the threshold to 1, it classifies all of the samples as <font color='red'>not obese<font color='black'>.\n",
        "    \n",
        "<div>\n",
        "\n",
        "<table><tr>\n",
        "<td>\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-52.jpg?raw=1\" width=\"350\"/>\n",
        "    \n",
        "</td>\n",
        "<td>\n",
        "    \n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-53.jpg?raw=1\" width=\"350\"/>\n",
        "    \n",
        "</td>\n",
        "</tr></table>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "u5lvRkipxXFr"
      },
      "source": [
        "# ROC Graph "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3su2_JUbxXFs"
      },
      "source": [
        "* Let's connect the dots, which gives us the ROC graph.\n",
        "* The ROC graph summarizes all of the confusion matrices that each threshold produced.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-54.jpg?raw=1\" width=\"400\"/>\n",
        "</div>\n",
        "    \n",
        "* **Exercise**: Which threshold(s) are optimal?"
      ]
    },
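    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "* The walk-through above can be sketched in code. The cell below turns each confusion matrix that was written out in the text into one (FPR, TPR) point and connects the points. It is a sketch under assumptions: only the thresholds whose counts appear in the text are included (any that are shown only in the figures are left out), and the variable names `counts` and `roc_points` are our own."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import matplotlib.pyplot as plt\n",
        "\n",
        "# (TP, FN, FP, TN) for the thresholds worked through above, lowest threshold first\n",
        "counts = [\n",
        "    (4, 0, 4, 0),  # everything classified as obese\n",
        "    (4, 0, 3, 1),\n",
        "    (4, 0, 2, 2),\n",
        "    (3, 1, 1, 3),\n",
        "    (3, 1, 0, 4),  # no False Positives\n",
        "    (0, 4, 0, 4),  # everything classified as not obese\n",
        "]\n",
        "\n",
        "# One (FPR, TPR) point per confusion matrix\n",
        "roc_points = [(fp / (fp + tn), tp / (tp + fn)) for tp, fn, fp, tn in counts]\n",
        "\n",
        "# Sort by FPR (then TPR) so the line runs from (0, 0) to (1, 1)\n",
        "roc_points.sort()\n",
        "fpr, tpr = zip(*roc_points)\n",
        "\n",
        "plt.plot(fpr, tpr, marker='o', label='ROC (walk-through points)')\n",
        "plt.plot([0, 1], [0, 1], linestyle='--', color='green', label='TPR = FPR')\n",
        "plt.xlabel('False Positive Rate (1 - Specificity)')\n",
        "plt.ylabel('True Positive Rate (Sensitivity)')\n",
        "plt.legend()"
      ],
      "execution_count": null,
      "outputs": []
    },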
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hk5xD4kyxXFs"
      },
      "source": [
        "# Area Under the Curve (AUC)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HOsdMnD2xXFs"
      },
      "source": [
        "* The **AUC** is equal to **0.9** here.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-55.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
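    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "* As a rough check, the area can be approximated with the trapezoidal rule using `sklearn.metrics.auc`. The (FPR, TPR) points below are an assumption: they are the ones read off the threshold walk-through in the text, so points that appear only in the figures are missing. The result should land close to the value quoted above, but a small difference is expected."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from sklearn.metrics import auc\n",
        "\n",
        "# (FPR, TPR) points from the threshold walk-through, ordered by FPR\n",
        "fpr = [0.0, 0.0, 0.25, 0.5, 0.75, 1.0]\n",
        "tpr = [0.0, 0.75, 0.75, 1.0, 1.0, 1.0]\n",
        "\n",
        "# auc() applies the trapezoidal rule to the given points\n",
        "area = auc(fpr, tpr)\n",
        "print(area)"
      ],
      "execution_count": null,
      "outputs": []
    },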
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pkaMgtWexXFs"
      },
      "source": [
        "* The **AUC** makes it easy to compare one **ROC** curve to another.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-56.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MeMqeOwhxXFt"
      },
      "source": [
        "# Other Metrics"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tAfXk5VoxXFt"
      },
      "source": [
        "* Other metrics can be used in place of the **False Positive Rate**, for example **Precision**.\n",
        "\n",
        "Precision = $\\frac{True\\, Positives}{True\\, Positives\\, +\\, False\\, Positives}$\n",
        "\n",
        "* **Precision** is the proportion of samples classified as positive that really are positive.\n",
        "\n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-57.jpg?raw=1\" width=\"500\"/>\n",
        "</div>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "S1Oa494exXFt"
      },
      "source": [
        "* If there were lots of samples that were <font color='red'>not obese <font color='black'>relative to the number of <font color='blue'>obese <font color='black'>samples, then **Precision** might be more useful than the **False Positive Rate**.\n",
        "    \n",
        "* This is because **Precision** does not include the number of **True Negatives** in its calculation, so it is not affected by the imbalance.\n",
        "    \n",
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-58.jpg?raw=1\" width=\"400\"/>\n",
        "</div>"
      ]
    },
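    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "* A small numeric illustration of this point. The counts below are hypothetical (they are not taken from the example data), and the helper names `precision` and `false_positive_rate` are our own. Both cases have the same True Positives and False Positives; the second simply has many more True Negatives. The **False Positive Rate** shrinks as True Negatives are added, while **Precision** stays the same."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "def precision(tp, fp):\n",
        "    # Share of positive predictions that are correct; True Negatives play no role\n",
        "    return tp / (tp + fp)\n",
        "\n",
        "def false_positive_rate(fp, tn):\n",
        "    # Share of actual negatives that were predicted positive\n",
        "    return fp / (fp + tn)\n",
        "\n",
        "# Hypothetical counts: same TP and FP, but many more True Negatives in the second case\n",
        "balanced = dict(tp=3, fp=1, tn=3)\n",
        "imbalanced = dict(tp=3, fp=1, tn=199)\n",
        "\n",
        "for name, c in [('balanced', balanced), ('imbalanced', imbalanced)]:\n",
        "    print(name,\n",
        "          'precision =', precision(c['tp'], c['fp']),\n",
        "          'FPR =', false_positive_rate(c['fp'], c['tn']))"
      ],
      "execution_count": null,
      "outputs": []
    },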
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xzJ6EaZLxXFu"
      },
      "source": [
        "# The Zoo"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "y9aVijgKxXFu"
      },
      "source": [
        "<div>\n",
        "<center>\n",
        "<img src=\"https://github.com/RahmanPeimankar/mini-teach-LTP/blob/master/img/quest/Qimage-59.jpg?raw=1\" width=\"6800\"/>\n",
        "</div>\n",
        "\n",
        "https://en.wikipedia.org/wiki/Sensitivity_and_specificity"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jxKVEF0RxXFu"
      },
      "source": [
        "# Implementation Examples"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "-uzYIa0VxXFv",
        "outputId": "3f97f2b2-9c4f-47b6-9c1e-4e3966cd5746"
      },
      "source": [
        "from sklearn.datasets import load_breast_cancer\n",
        "from sklearn.linear_model import LogisticRegression\n",
        "from sklearn.metrics import classification_report, confusion_matrix\n",
        "from sklearn.model_selection import train_test_split\n",
        "\n",
        "# Load the data and hold out a stratified test set\n",
        "data = load_breast_cancer()\n",
        "X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, stratify=data.target, random_state=0)\n",
        "\n",
        "# Fit a logistic regression model and predict with the default 0.5 threshold\n",
        "lr = LogisticRegression().fit(X_train, y_train)\n",
        "y_pred = lr.predict(X_test)\n",
        "\n",
        "# Rows are true classes, columns are predicted classes\n",
        "print(confusion_matrix(y_test, y_pred))\n",
        "print(lr.score(X_test, y_test))  # accuracy on the test set\n",
        "print('#'*60)\n",
        "print(classification_report(y_test, y_pred))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[[47  6]\n",
            " [ 5 85]]\n",
            "0.9230769230769231\n",
            "############################################################\n",
            "              precision    recall  f1-score   support\n",
            "\n",
            "           0       0.90      0.89      0.90        53\n",
            "           1       0.93      0.94      0.94        90\n",
            "\n",
            "    accuracy                           0.92       143\n",
            "   macro avg       0.92      0.92      0.92       143\n",
            "weighted avg       0.92      0.92      0.92       143\n",
            "\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "C:\\Users\\abpe\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\_logistic.py:762: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
            "STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
            "\n",
            "Increase the number of iterations (max_iter) or scale the data as shown in:\n",
            "    https://scikit-learn.org/stable/modules/preprocessing.html\n",
            "Please also refer to the documentation for alternative solver options:\n",
            "    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
            "  n_iter_i = _check_optimize_result(\n"
          ],
          "name": "stderr"
        }
      ]
    },
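    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "* The `ConvergenceWarning` in the output above suggests scaling the data or raising `max_iter`. One possible way to follow the first suggestion is sketched below: the features are standardized in a pipeline before the same classifier is fitted. This is an optional variant, not the model used in the rest of this section; the name `lr_scaled` is our own, and the split is repeated with the same settings so that the cell can run on its own."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from sklearn.datasets import load_breast_cancer\n",
        "from sklearn.linear_model import LogisticRegression\n",
        "from sklearn.model_selection import train_test_split\n",
        "from sklearn.pipeline import make_pipeline\n",
        "from sklearn.preprocessing import StandardScaler\n",
        "\n",
        "# Same data and split as in the cell above\n",
        "data = load_breast_cancer()\n",
        "X_train, X_test, y_train, y_test = train_test_split(\n",
        "    data.data, data.target, stratify=data.target, random_state=0)\n",
        "\n",
        "# Standardize the features, then fit the same type of classifier\n",
        "lr_scaled = make_pipeline(StandardScaler(), LogisticRegression())\n",
        "lr_scaled.fit(X_train, y_train)\n",
        "print(lr_scaled.score(X_test, y_test))  # accuracy on the test set"
      ],
      "execution_count": null,
      "outputs": []
    },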
    {
      "cell_type": "code",
      "metadata": {
        "id": "U9EFOJPNxXFy",
        "outputId": "8a5c2ae2-1018-48cf-bb84-42c0e621c4e7"
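      },
      "source": [
        "# Raise the decision threshold for class 1 from the default 0.5 to 0.85\n",
        "y_pred_thresh = lr.predict_proba(X_test)[:, 1] > .85\n",
        "print(classification_report(y_test, y_pred_thresh))\n",
        "print('#'*60)\n",
        "# Report for the default threshold, for comparison\n",
        "print(classification_report(y_test, y_pred))"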
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "              precision    recall  f1-score   support\n",
            "\n",
            "           0       0.85      1.00      0.92        53\n",
            "           1       1.00      0.90      0.95        90\n",
            "\n",
            "    accuracy                           0.94       143\n",
            "   macro avg       0.93      0.95      0.93       143\n",
            "weighted avg       0.95      0.94      0.94       143\n",
            "\n",
            "############################################################\n",
            "              precision    recall  f1-score   support\n",
            "\n",
            "           0       0.90      0.89      0.90        53\n",
            "           1       0.93      0.94      0.94        90\n",
            "\n",
            "    accuracy                           0.92       143\n",
            "   macro avg       0.92      0.92      0.92       143\n",
            "weighted avg       0.92      0.92      0.92       143\n",
            "\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "g8uSFVBOxXFz",
        "outputId": "c96c779f-7a84-49f6-a985-c88bab3324bb"
      },
      "source": [
        "from sklearn.metrics import roc_curve\n",
        "from sklearn.svm import SVC\n",
        "from sklearn.ensemble import RandomForestClassifier\n",
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "\n",
        "# Fit two classifiers on the same training data\n",
        "svc = SVC(gamma=.05).fit(X_train, y_train)\n",
        "rf = RandomForestClassifier(n_estimators=100, random_state=0, max_features=2)\n",
        "rf.fit(X_train, y_train)\n",
        "\n",
        "# roc_curve needs a continuous score per sample:\n",
        "# the SVC's decision_function, and the forest's probability for class 1\n",
        "fpr, tpr, thresholds = roc_curve(y_test, svc.decision_function(X_test))\n",
        "fpr_rf, tpr_rf, thresholds_rf = roc_curve(y_test, rf.predict_proba(X_test)[:, 1])\n",
        "\n",
        "# Plot both ROC curves on the same axes\n",
        "plt.figure(figsize=(7,4))\n",
        "plt.plot(fpr, tpr, label=\"ROC Curve SVC\")\n",
        "plt.plot(fpr_rf, tpr_rf, label=\"ROC Curve RF\")\n",
        "plt.ylabel(\"TPR (Sensitivity)\", fontsize=20)\n",
        "plt.xlabel(\"FPR (1 - Specificity)\", fontsize=20)\n",
        "plt.tick_params(axis='x', labelsize=20)\n",
        "plt.tick_params(axis='y', labelsize=20)\n",
        "plt.legend(fontsize=20)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<matplotlib.legend.Legend at 0x2b65d37d940>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 42
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAd8AAAEYCAYAAAAd7fxUAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdeXgV1fnA8e+bcJOwJBAWRQIhoAgIFZCoFFQWQaFuIBS1FQEVsFWr/qzVai2gWKvigtrasgi41A2qlApCWWQTWsFdZBMwIJvsEEjI8v7+mEnMctfk3tws7+d57jO5c86cOTPJvW/OmTNnRFUxxhhjTMWJiXYFjDHGmJrGgq8xxhhTwSz4GmOMMRXMgq8xxhhTwSz4GmOMMRWsVrQrUF00btxY09LSol0NY4wxlcS6dev2q2oTb2kWfMMkLS2NtWvXRrsaxhhjKgkR+c5XmnU7G2OMMRXMgq8xxhhTwSz4GmOMMRUs6sFXRIaIyAsiskJEjoqIishrZSyruYi8LCK7RCRbRLaLyHMikuxnm+4iMk9EDorICRH5QkTuFpHYsh+VMcYY41tlGHD1B6ATcBzYCbQrSyEicibwEXAaMAfYAFwA3AX0F5EeqnqgxDbXALOBLOAt4CBwFfAs0AP4eVnqYowxxvgT9ZYvcA9wNpAE/Koc5fwVJ/D+RlUHquoDqtoHJ5C2BR4rmllEkoApQB7QS1VvUdX7gM7AamCIiFxfjvoYY4wxXkU9+KrqUlXdrOV4vJKItAYuA7YDfymRPBbIBIaJSN0i64cATYA3VbXwHiFVzcJpjUP5/hkwxhhjvKoM3c7h0MddLlTV/KIJqnpMRFbhBOduwOIS23zgpbzlwAmgu4jEq2p2BOpc9Xw8DY7tiXYtjDEmYo5n5/LV90dodslNpJ7dOWL7qS7Bt6273OQjfTNO8D2bH4Ovz21UNVdEtgEdgNbAN94KFZHRwGiA1NTUMlW8yjj+A7z/f+4biWpVjDEmUuoCFyh8seMCC75BqO8uj/hIL1jfoJzbFKOqk4HJAOnp6WXuNq8SNM9ZXvkspN8c3boYY0w5bdl3nK0/HC+1/rsDJ3hs3jdMa5Ye0f1Xl+AbSEFTLZQAWZZtjDHGVAEjZ/yPHQdP+kyvX9sT0f1Xl+Bb0Eqt7yM9qUS+sm5T/Z06AScPll5/fF/F18UYYyLk5Kl8BnRsyu29zyqVVicultZN6kV0/9Ul+G50l2f7SG/jLote390IpLvbrCuaWURqAa2AXGBr+KpZSWUdhU0LYP17sGUR5Gb5zhsbX3H1MsaYCEquG0fHFF/tr8iqLsF3qbu8TERiio54FpFEnAkzTgJrimyzBPgl0B94o0R5lwB1gOXVdqTzycOwcT6snwPfLoa8U5B4Bpx3E5zeEcTLoKrYOGh3ZcXX1RhjqpkqFXxFxAOcCeSo6rcF61X1WxFZiDOi+XbghSKbjccZwPZ3Vc0ssn4W8ARwvYi8UHCvr4gkABPcPC9F7GCi6Zu58M5IyM+BpOZw/ig45xpofj7ERP3Wb2OMCYt//DeDpxdu9Dpw52DmqajetxH14CsiA4GB7tum7vKnIjLD/Xm/qv7W/TkF57af74C0EkX9Gmd6yedF5FI334VAb5zu5oeKZlbVoyIyCicIfygib+JML3k1zm1Is3CmnKx+9n7tBN5bF0NKV++tXGOMqeK+2HmYE6fyGNK1eak0ERia3iIKtXJEPfjiTOc4vMS61u4LnED7WwJwW7/pwCM4Xck/A3YDzwPjVbXUKCJVfU9EeuIE5sFAArAF+D/g+fLMulUlWOA1xlRx+45m8Yf3vuJkTl6ptM17j5NUuxaPDuwYhZr5F/Xgq6rjgHFB5t2OnxkeVHUHMDLE/a/CCdTGGGOqmM93HmHh+r20PT2ROvHFH0Z3RoMELmjVMEo18y/qwdcYY0zNsf94Nq+u/o6cvPzAmYOwbb
8zlOfpoZ2iNnK5LCz4GmOMqTD/Wb+XSYs3UytGwnbVq1HdOE5Lqlq3QVrwra5UYeuHkOVljpB9XqeqNsaYiMt3h9J89EAfTktKiHJtoseCb3W1fxO8OtB3ekLV6Z4xxpjqxoJvVXcq0/vUjwfc26AHPAlpF5dOr3eajXQ2xpgoseBb1U27HPZ+6Tu90Zlw+jkVVx9jjDEBWfCt6k7sh5Y9oMuw0mme2tCqZ8XXyRhjjF8WfKuDRmdC5xuiXQtjTBWiqmTn5pOVk8fJnDxOnnKWWTl5nDyV76zLySPLXV+Qx2v+wvdueW6at3mKThXcYlTDr3pZ8DXGmEomJy/fa+Ara3DMyskvnZ6TR1nm8EvwxFDbE0ttTywJcbGFP9eNr0Wjej++T/DEID7GlZyelECTelXr1qBws+BrjDFBys9XsnJ9B0KfLcMiAbL4+/xiwbPg59z80KOiJ1ZIcANf7biCAOgsmyR6fnwf5z141o6L9b59kTzxtWKIianhTdYwCTr4ikg9nEftXQKkAo1xHtO3D/gMWKqq6yNRSWOM8UdVOZWXT1aJQBg4+BVtOeaXCp5FW4knT+WRnRv6rEwiUMdLcEuoFUv92h6aJsWXDn5BBsMEN5AmeGLxxNoTyaqSgMFXRLrhPDFoCBCP7556FZGNOI/hm6Gqx8JWS2NMlZWbl09WbunAFrjbND9g8Cz6cxkai8TXiikezIoEt+Q6ce7PMf5biT4CZUFwjIv13f1qai6fwVdEzgYmAlcA+cAKYBXwMbAH5/F7tYFGQDvgp0AfYBLwsIiMxXmGbngm8DTGhFV+vjPgJqQu02KB0sdgnRLvc/JCj4qxMUKdEgEvwQ2EjerGUTs59FZi7biYYnnia8USa12oJkr8tXy/wulSfgB4TVV3+8n7IfA3cf696weMAV4EGgCPh6eqxtQMqkpOngYR/EJrJZbcPiunbP8XFw9wP7YcExNqcVpivJ8uUt+BsGQe60I11Z2/4PsA8FdVzQq2MPf5twuBhSLSCTijnPUzplLJy1fv1wRLtfx8BENfwbNESzKvDH2ocbVivLQEneDYoI6nMNgllGwpunkCtxydATfWhWpM+fkMvqr6THkKVtXPgc/LU4YxwSq4Z/Gkz5ahs8zOCaKbtfB9fqngeaoMA25iBOrE1So90tQTS3LdOJoFHHn646AaX8ExwWNdqMZUJaGMdvaoak4kK2Oqp3Dfs/hjWpF7F3Mje8+iz2Dos1v1x/yeWLHWojGmmFDu8/1eRKYDU1R1S6QqZCpOVb1n8TS7Z9EYU8WFEnxjgPuA34rIEuBvwBxVzY1IzUy5/HAsm6teWMnRLO+dFbn5WqYuVJ/3LHpiaVDbQ+2kBL+txIRads+iMcaEEnyb4dzrOxq4FOe2on0i8jIwVVW3RaB+poz2Hs1iz9Es+p1zOmmN6pRKjxH5MfDZPYvGGFOhgg6+qnoK+AfwD/ce4DHATcDvgftFZCHwd2Cu3dtbeQxNb0G/c06PdjWMMcYUUaa+PVXdpKr3AinAjTgTcPQH/glkiMg4EWkWvmoaY4wx1Ue5Lqy5reH3gXeBXThTTzYD/ghsE5HnRKRmP7rCGGOMKaHMwVdEurmjn3cBzwJ1geeBzsDNwEbgTuC5MNTTGGOMqTZCeqSgiCQCw3Cu93bEael+gvMwhX+o6kk36xci8irwAc4grV+FrcY10A/Hslm55Qev97EOyMljz/5MPv1kZ7H1Ow+dLJ3ZGGNMpRDKJBtTgeuAOkA28CrO9JP/85ZfVfNE5EOcUdGmHP6ydAszPtruNa17fA5rth7k95u8TybWsG5cBGtmjDGmLEJp+d4MfItzf+90VT0YxDYfAo+UoV6miFN5+STX8fDe7T1KpTV+OZ4rW55B90t7lUpL8MRyelJCBdTQGGNMKEIJvgNUdUEohavqKpzHEJpyio2JoWWjuqUTYoTEhFokekszxhhTKYUy4Op0ETnXXwYR6SgiN5WzTsYYY0y1FkrwnQ
EMDJDnGmB6mWtjjDHG1ADhnkA3FijDs2WMMcaYmiOkW42CcDZwKMxlGm9OHISN8yHraLRrYowxJkR+g6/70ISiBopImpessUAqcDHOjFcmEjIPwIZ/w/o5sG0Z5OdC/RbQ9opo18wYY0wIArV8RxT5WXFmr+rsI68C/wXuKX+1TFHNT6znL3kvwMT1oHmQnAY/vR3OuQaanec8588YY0yVESj4tnKXAmzFmSpykpd8ecAhVc0sa0VEpDnOPcH9gUbAbuA9YLyqBuzKFpERBB7sla+qsUW2SQP8PQrxLVW9PtC+I63rwXl00W/g4rudgNv0XAu4xhhThfkNvqr6XcHPIjIeWFp0XbiIyJnAR8BpwBxgA3ABcBfQX0R6qOqBAMV8Boz3kXYxzkxb832kf44T6Ev6KsA+K4hylHo0vvSP0a6IMcaYMAjleb6+Als4/BUn8P5GVV8oWCkiz+B0Yz8G3Bagfp/hBOBSRGS1++NkH5t/pqrjQqyzMcYYUyY+g6+IpLo/fu/O05zqK29JqpoRbF4RaQ1cBmwH/lIieSwwGhgmIveWpVtbRDoC3YDvscFgxhhjKgF/Ld/tOIOo2gObirwPRAOUW1LBgxcWqmp+sYJUj4nIKpzg3A1YHEK5Bca4y2mqmucjTzMRGYNzrfkAsFpVvyjDvowxxpiA/AXJV3AC6ZES78Otrbvc5CN9M07wPZsQg6+I1AZuBPKBqX6y9nNfRbf9EBgeSiveGGOMCYbP4KuqI/y9D6P67vKIj/SC9Q3KUPZQd7v3VXWHl/QTwKM4g622uuvOBcYBvYHFItLZV3e3iIzG6RYnNTXoXnljjDE1XLinl4yEgntqytLqHu0u/+4tUVX3qeofVfUTVT3svpbjtLT/C5wF3OqrcFWdrKrpqprepEmTMlTPGGNMTRR08BWRt0RkgIiEO2AXtGzr+0hPKpEvKCJyDtAd2AnMC2VbVc3lx27qS0LZ1hhjjAkklED6c+DfwPci8pSI/CRMddjoLs/2kd7GXfq6JuxLMAOt/PnBXdqDco0xxoRVKMH3pzjdt3HAvcBnIrJWRO4UkcblqMNSd3lZyVa1iCQCPYCTwJpgCxSRBGAYzkCraWWsVzd3udVvLmOMMSZEQQdfVf2vqv4aOANnINM8nMFJk3Baw/8UkYEiEtKTklT1W2AhkAbcXiJ5PE7L85WCQU8i4hGRdu6sWL78HEgG5vkYaIVb1oUiEudlfR9+nKP6tWCPxRhjjAlGyI8UVNVTwCxglog0wbmVZzgwELgG5z7Z00Is9tc400s+LyKXAt8AF+KMON4EPFQkb4qb/h1OwPamYKCVrxmtCjwBdHBvK9rprjuXH+89flhVPwr6KIwxxpgglGvwlKr+oKrPAl2A3wK5OBNVhFrOt0A6MAMn6N4LnAk8D/w0iHmdC4lIe+Aighto9SrOqObzgVE4/wS0Ad4GLlHVCSEdiDHGGBOEkFu+RYlIW5xW7404LVLBmRQjZG738Mgg8m3nx9uPvKV/4y+9RN5plP2asDHGGFMmIQdfEUkGrscJuufjBLqjOEFspqquCmsNjTHGmGom6OArIlfiBNwrcUY8K7AImAn8U1WzIlJDY4wxppoJpeX7L3e5CSfgvqKq34e/SsYYY0z1FkrwnQzMUNWg77c1xhhjTGlBB19V9fswe2OMMcYEpyo8WMEYY4ypVny2fEVkCc6gquGqutN9HwxV1UvDUjtjjDGmGvLX7dwLJ/jWKfI+GGV59J8xxhhTY/gMvqoa4++9McYYY8rGAqoxxhhTwYIOviLysohcHSDPlSLycvmrZYwxxlRfobR8RwCdA+TphDMLlgmj2nnHyS3fNNzGGGMqkXB3O8cDeWEus2bLyeLso2v4n5wb7ZoYY4wJk1CDr8+RzCISD1wC7ClXjUxx3y4mIT+ThdI92jUxxhgTJn77MkVka4lV94iIt8f+xQJNcFq+fwtT3QzAV/8kM7Y+H/OTaNfEGGNMmAS6kBjDj6
1dxXl8oLdn5eYAXwKLAXsAfbjknISN8/m6/qXkHbVrvsYYU134/UZX1bSCn0UkH3hWVR+JdKWMa/NCyMnkiwaXOk9MNsYYUy2E0pzqDWyPUD2MN1/9E+o2YXu9TsCBaNfGGGNMmITyVKNlkayIKSH7OGxaAF1+Sf4p63I2xpjqxN+DFW5yf3xXVY8VeR+Qqr5S7prVdJsXQO5J6HAtfBrtyhhjjAknf02qGTiDrNYAx4q890fcPBZ8y+urf0K9ppD6U/j062jXxhhjTBj5C7434wTS3e57b7cYmUjIOgqb/wPpIyHGpt82xpjqxt9TjWaUeD8z4rUxjo3zIS/b6XI2xhhT7VizqjL6+l1Iag7Nz492TYwxxkRAKE81ShaRc9xpJIuuHykic0TkHyJyYfirWMOcPAxbFkGHgdblbIwx1VQo97D8CbgROK1ghYjcCTzHj7NeDRSRdFVdH74q1jAb3of8HOhoXc7GGFNdhdK06gEsVtWTRdb9Fvge54EKQ911/xemutVMX/8TGrSEZudFuybGGGMiJJTgmwJsK3gjIucALYAXVHWlqs4C5uIEYlMWJw7C1g+hwyAQb1NoG2OMqQ5CCb61gawi73vg3Iq0qMi6b3GCtCmLb+ZCfq51ORtjTDUXSvD9HmhX5P3lONP9f15kXTJQtFvahOLrf0LDM6HpudGuiTHGmAgKZcDVUmC4iNyB0wK+GpitqvlF8pwF7Ahj/WqMrMN7iNu6nM9ajuTj5cUfo7x+lz3SyBhjqpNQgu/jwGBgEs7o5uPAuIJEETkN6AlMCWP9aoxNny7nXPL506YU1m7cUCr9vNQGUaiVMcaYSAjlqUbbRKQDMMRd9S9VzSiSpSXwF+AfYaxfjZGf73QgjL+2K606XVQqPb5WbEVXyRhjTISE9Kw6Vd0DvOgj7WPg43BUqiaLqxVDnTh7hKAxxlRnlWYKJRFpLiIvi8guEckWke0i8pyIJIdQxnYRUR+vPX626y4i80TkoIicEJEvRORuEbHmpjHGmLALqYklIh7gGuACnJHN3oKTquotIZZ7JvARzuxZc4AN7j7uAvqLSA9VPRBkcUdwZt0q6biPfV8DzMYZRPYWcBC4CngW53aqnwd/JMYYY0xgQQdfEWkG/AfndiN/M0AoEFLwBf6KE3h/o6ovFNnnM8A9wGPAbUGWdVhVxwWTUUSScAaI5QG9VHWtu/5hYAkwRESuV9U3gz0QY4wxJpBQup2fBtoDbwJ9gDZAKy+v1qFUQERaA5cB23EGbBU1FsgEholI3VDKDdIQoAnwZkHgBVDVLOAP7ttfRWC/xhhjarBQup0vA5ar6i/DXIc+7nJhiXuGUdVjIrLK3Xc3YHEQ5cWLyI1AKk7g/gKn3nl+9v2Bl7TlwAmgu4jEq2p2EPs2xhhjAgql5ZsA/DcCdWjrLjf5SN/sLs8OsrymwKs4XdXP4XQfbxaRnqHsW1VzceayrkWIrXljjDHGn1CC71c49/KGW313ecRHesH6YGaZmA5cihOA6wI/Af4OpAHzRaRTOPctIqNFZK2IrP3hhx+CqJ4xxhgTWvB9CrjafZpRRSoY3KWBMqrqeFVdoqp7VfWEqn6lqrcBz+A8GGJcOPetqpNVNV1V05s0aRJi0cYYY2qqUK757sN5ZOBHIjIJWAcc9pZRVZeHUG5B67K+j/SkEvnK4m/AvZR+3GFF7NsYY4wpJpTg+yFOC1CAh/HfEg1lcoqN7tLXNd027tLXNeFg7HOXJUdMbwTS3X2vK5ogIrVwRm/nAlsxxhhjwiSU4PsIQXT9lsFSd3mZiMQUHfEsIok4E12cBNaUYx8/dZclg+gS4JdAf+CNEmmXAHVwRkrbSGdjjDFhE8qDFcZFogKq+q2ILMS5neh24IUiyeNxWqt/V9VMKJxl60wgR1W/LcjoPvRht6oeLFq+iLTkx/moXyux+1nAE8D1IvJCkUk2EoAJbp6Xyn+UxhhjzI8qywz+v8aZXvJ5EbkU+Aa4EO
iN0938UJG8KW76dzijmAv8HHhARJbi3CJ0DCdIX4Fzm9Q8YGLRnarqUREZhROEPxSRN3Gml7wa5zakWThTThpjjDFhE3LwdVuel+LMdlVPVR911yfgDFDaX3KyjEDc1m86Ttd2f+BnwG7geWB8ydasD0txAmYXnG7mujgDwlbi3Pf7qqqW6jZX1ffce4AfwnlecQKwBfg/4Hlv2xhjjDHlEeqDFfoD03DuoxWca8CPusmdgVXAjZS+fhqQqu4ARgaRbzte5pZW1WXAslD36267CifgG2OMMREX9H2+bsv0PZyAew/wj6LpqroGp7t3UDgraIwxxlQ3oUyy8TDOXMfpqvo8P077WNTHQMlZpIwxxhhTRCjBtwfwnqr6fCg9sAM4o3xVMsYYY6q3UIJvPWB/gDx1QizTGGOMqXFCCZTfAx0C5OmMzQZljDHG+BVK8J0PXC4iF3lLFJEBQHfg3+GomDHGGFNdhRJ8H8e5b3ahiDwBnAMgIle479/BuTf3mbDX0hhjjKlGQple8nsRuQx4G7ivSNK/cO67/Ra4VlUDXRc2xhhjarSQJtlQ1U9EpC3OlI0/BRrhPG5vDTBHVXPDX0VjjDGmegl5eklVzcNp7f4r/NUxxhhjqr9y3xYkIskikhyOyhhjjDE1gd/gKyJNRKSPiKR4SesqIp/g3Pu7X0S+EJHukaqoMcYYU10EavneBvwHKNayFZHTgAU49/XmAJlAR2C+iKRGoJ7GGGNMtREo+F4EbFbVr0qsvxNoiHPdNxlogDMCOhG4K9yVNMYYY6qTQMG3DfCZl/VXAfnAr1X1pKrmq+rTwJc4z/o1xhhjjA+Bgu9pwPaiK0SkNk4X8xequqtE/lVAq7DVzhhjjKmGAgXfGJyHJRT1E3f9x17yHwQSwlAvY4wxptoKFHy/B84rse5iQIG1XvInAz+EoV7GGGNMtRUo+H4I/FREbgQQkdOBX+EE3wVe8nfGeaavMcYYY3wIFHyfArKBmSJyECewtgbeVdWMohnd248uwLnua4wxxhgf/AZfVd0EXInzjN4G7urZwCgv2UcDscDCcFbQGGOMqW4Czu2sqkuANiLSBDiiqqd8ZH0GeAE4Gsb6GWOMMdVOKI8U9DuQSlVPlL86xhhjTPVX7gcrGGOMMSY0PoOviLwgIk3LWrCIDBKRG8q6vTHGGFNd+Wv5/hL4VkReEpELgylMROqLyBj3aUezgEbhqKQxxhhTnfi75nsm8CjOKObRIrID5zaitcBu4BDObFaNgHZAN+B8IB74BrhSVedHrurG1CzZ2dkcPHiQY8eOkZeXF+3qGFOjxMbGkpiYSMOGDYmPjy93eT6Dr6oeAu4QkSdwHi04ArjBfWmJ7ALkAYuBvwL/VtX8ctfOGAM4gTcjI4Pk5GTS0tLweDyISLSrZUyNoKrk5ORw9OhRMjIySE1NLXcADuZWox3AQ8BDItIB5zGDqTgt3pPAPuALYIWq2m1GxkTAwYMHSU5OpnHjxtGuijE1jogQFxdX+Pk7ePAgZ5xxRrnKDPpWIwBV/Rr4ulx7NMaE7NixY6SlpUW7GsbUeElJSWzfvr3cwdduNTKmCsjLy8Pj8US7GsbUeB6PJyxjLiz4GlNF2DVeY6IvXJ9DC77GGGNMBbPga4wxxlQwC77GGGNMBas0wVdEmovIyyKyS0SyRWS7iDwnIslBbt9IRG4VkXdFZIuInBSRIyKyUkRuEZFSxyoiaSKifl5vhv9IjTHG1HSVIviKyJnAOmAk8D/gWZxnCN8FrBaRYKap/DkwBbgQ+C/wHM6zhzsCU4G3xfeV8s+B8V5es8p4SMaYCBCRYq/Y2FgaNmxIr169mDFjBqol5/8pbtGiRVx33XWkpqaSkJBAcnIy559/PuPHj+fQoUN+t83Pz2fWrFkMHjyYFi1akJCQQN26dWnfvj2jR49m1apVIR/Phg0buPPOO+nYsSP169cnLi6OZs2accUVVzBt2jSysrJCLrMqOXz4MH/84x
/p3Lkz9erVIz4+npSUFLp168a9997Lp59+CsCmTZsQEVJSUgKONF61ahUiQqdOnUql7dixgwceeICuXbuSnJyMx+PhtNNOo2/fvkyaNIkjR45E5Di9UtWwvXDuG769DNstwJk1684S659x1/8tiDL6AFcBMSXWNwUy3HIGl0hLc9fPKO+xd+3aVcvj00VvqI5N0k2fLCtXOaZ6Wr9+fbSrUCm4n1cdO3asjh07Vh988EEdOnSoejweBfT222/3ul1WVpbeeOONCmjt2rX12muv1QceeEDvuOMOPeeccxTQxo0b67Jl3j9/u3fv1h49eiigiYmJeu211+p9992nv/3tb/Waa67RevXqKaDPP/980Mcyfvx4jYmJUUC7deumd955p/7+97/Xm2++WVu3bq2Alvd7pTL7/vvvNS0tTQFt3bq1jh49Wh944AEdNmyYnn/++RoTE6MPPfRQYf6ePXsqoHPmzPFb7ogRIxTQF198sdj6KVOmaHx8vALaqVMn/dWvfqUPPvigjhkzRjt06KCANmrUKKi6B/t5BNaqr5jlKyGUF870ksNxWqt5IW7b2v1AbfMSOBOB40AmULcc9XvQ3ccLJdZb8DVVggVfR0HwLWnlypUaExOjIqJbt24tlT5y5EgF9LzzztOMjIxiafn5+frCCy9oTEyM1qtXr9S5zszM1E6dOimg119/vR48eLBU+UeOHNGHH35YJ0yYENRxPPbYYwpoixYtdM2aNV7zzJ07V3v16hVUeVXRLbfcooDefPPNmp+fXyp9165dum7dusL3r7/+ugJ61VVX+SzzyJEjWqdOHa1Tp44ePny41LbJycn673//2+u2K1eu1E6dOgVV9woJvkAy8DDwL5xu3LuBhCLpVwLrceZ2zgNmBSqzRPm3uh+ov/tIL2gVXxpKuSXKuM8t49kS6wuC70JgjBukxwDnhroPC74mkiz4OnwFX1UtbMG+8847xdavWLGi8It3165dPsu+//77FdC+ffsWWz9hwgQFtEePHpqXl+e3fllZWQGPYdu2berxeNTj8eiXX34ZdHlLly4tbPV707JlS23ZsmWxddOnT1dAp0+frvPnz9eePXtqUlKSArpz506NiYnRLl26+Nz/5ZdfrkCpeq5Zs0YHD9R2LcgAACAASURBVB6sp59+uno8Hm3evLmOHj1av//+e/8HX0T79u0V0E8//TSo/FlZWdqoUSONjY31uZ+XXnpJAR0xYkThuqNHj2rDhg0V0AULFgTcRzDCEXz9XvMVkcY412LHuUF2EPA0ME9EYkVkCjAH56lG7wNdVXWIvzK9aOsuN/lI3+wuzw6xXABEpBZwk/v2Ax/Z+gF/Ax5zl5+LyFIRSQ1Q9mgRWSsia3/44YeyVM8YEybOdx2lZgKbMmUKAKNGjfI7JeD9999PfHw8ixYtYtu2bYXrJ0+eDMDDDz9MTIz/YTLBTLY/ffp0cnJyGDx4MB07dix3ecGYNWsWV155JYmJidx2220MHTqUlJQU+vbty6effsqXX35Zapvdu3ezaNEiunbtWqye06dPp0ePHsyfP5/evXtz9913k56eztSpU0lPTycjIyOoOjVq5Azl2bTJ11d/cfHx8QwbNoy8vDymT5/uNU/R33XRYz948CDdunXjsssuC7iPihJobucHcFqHnwOv43QvDwN64gTby3AGN92lqv8rYx3qu0tfV7oL1jcoY/l/xhl0NU9VF5RIO4Hz2MT3cLrMAc7F+WejN7BYRDqraqa3glV1MjAZID093f9ID2MiZPzcr1m/q3I/0+ScZkmMvapDxMpfvnw5GzduJC4ujgsuuKBY2sqVKwHo27ev3zKSk5Pp2rUrH330EatWraJVq1bs2LGDjIwMatWqRc+ePcNS14L6XHrppWEpLxjz5s1j3rx59O/fv9j6ESNGsHDhQmbOnMnEiROLpb322mvk5eUxfPjwwnWbNm1izJgxpKWlsWzZMlJSUgrTlixZQr9+/bjrrrt499
13A9bpuuuuY+XKldx6662sXbuWyy67jC5duhQGZW9Gjx7Nc889x7Rp03jwwQeLzTb16aef8sknn9ChQwe6d+9euD4a5zsYgUY7DwC+Ay5U1Ymq+hTOaOKdOK3FN4Hu5Qi8wSg4uyEHNxH5DXAvsAHnn4ZiVHWfqv5RVT9R1cPuazk//lNxFk63uDGmEhk3bhzjxo3joYce4rrrrqNv376oKhMnTizVut29ezcALVq0CFhuQZ5du3YV27ZRo0YkJCSEpe4FZTZv3jws5QXjmmuuKRV4AQYOHEj9+vV5/fXXS40injlzJh6PhxtuuKFw3UsvvUROTg6TJk0qFngB+vTpw9VXX83cuXM5duxYwDrdfvvt/P73vycnJ4ennnqKfv360bhxY1q1asWoUaP4/PPPS23Tvn17LrroIrZt28bixYuLpRW0ekePHl1sfTTOdzACtXzTcAYjnSpYoaonReTfOM/4fVgL+nrKrqBlW99HelKJfEERkduBSTjXoy9V1YPBbququSIyFecfjUvccoyplCLZoqysxo8fX+y9iDBt2jRGjhzpc5tg5uQt+DoryFvyfThEosxASvYGFKhduzZDhw5lypQpLFiwgJ/97GcArFu3jq+//ppBgwYVe4zl6tWrAVi2bBkff/xxqfL27dtHXl4emzZtomvXrn7rJCL86U9/4ne/+x0LFixgzZo1fPLJJ/z3v/9l6tSpTJ8+nZdeeqlYFzI4XcorV65kypQphb0ZJ0+e5B//+AcJCQkMG1a8nRWN8x2MQMG3NrDXy/p97nKrl7RQbXSXvq7ptnGXwV0YAETkbpx7hb/CCbz7AmziTcFF3Lpl2NYYE0EFX6iZmZmsXr2aW265hdtuu42WLVvSp0+fYnmbNm3Ktm3byMjIoG3btt6KK7Rz506AwtZzs2bNANi/fz9ZWVlhaf02a9aMDRs2FO6rIjRt2tRn2ogRI5gyZQozZ84sDL4zZ84EKNblDHDgwAEAnnrqKb/7O378eNB1a9CgAddddx3XXXcd4PxO//znPzNhwgTuvPNOrr76ak4//fTC/EOHDuXuu+/mvffeY//+/TRu3Ji3336bI0eOcOONN5KcXHxepoLfYUWe72CUa5KNMLR6AZa6y8tKzkIlIolAD+AksCaYwkTkfpzA+xnQu4yBF6CbuwzHPxjGmAioW7cuffv2Ze7cuYXXJ0+cOFEsz0UXXQQ4E2z4c+jQIdatWwdAjx49AKcbOjU1ldzcXJYvXx6WOhfUp2S3aSAFg71yc3O9pvubIMJfq6979+60adOGOXPmcPjwYXJycnjjjTdo3LhxYTAuUL9+/cJ9+RrFq6rluj5et25dHn30US666CKys7NLTV6SkJDAjTfeyKlTp3jllVcAmDp1KlC6yxnKfr4jLZjg21lEbir6AjoDiMiwkmluetBU9VucW33SgNtLJI/HaXm+UjDoSUQ8ItLOnRWrGBF5GGeA1TqcFu9+f/sWkQtFJM7L+j7APe7b10I5HmNMxTv33HMZNWoUO3fu5Nlnny2WduutzrCNqVOnsnevt448x8SJE8nOzqZv3760atWqcH3BF/qECRPIz8/3W4/s7OyAdR05ciQej4fZs2ezfv36oMsraNHt2LGjVL4tW7Zw+PDhgPv2Zfjw4WRnZ/PWW2/x/vvvs3//fn7xi1+UGjnerZvTJlmxYkWZ9xWsxMRE4MdejqIKfidTp05lw4YNrFy5knbt2nHxxReXyjtkyBAaNmzI6tWrA/4DFszvL2z8/fcC5PPj/bslXz7T/JXpYz9n4nRvK87I48eBJe77jUCjInnT3PXbS5Qx3F2fi9PyHeflNaLENh/idC+/427zLLDYLUeBPwR7DHafr4kku8/XUfDZ9Gbnzp2akJCgDRo0KDURxrBhwxTQ9PR03bFjR6ltX3rpJY2NjdV69erp119/XSyt6CQbv/zlL/XQoUOltj927JiOHz8+5Ek20tLS9OOPP/aaZ/78+dq7d+/C96
dOndKkpCStX7++7t27t3D9iRMndMCAAQr4vc/Xn4yMDI2JidHu3bvroEGDFNBPPvmkVL5vvvlGPR6PtmnTRjdu3FgqPTs7W5cvX+53XwWefPJJ/eqrr7ymrVixQhMSErRWrVo+7+nt1q2bAnrxxRcroE8//bTPfb322msKaMOGDfWDDz7wmmf16tV+73kuKhz3+Qa65jszUPAOB1X9VkTSgUeA/sDPgN3A88B4DW6wVMG/qrE4E4F4swyYUeT9qzj3Lp+PM7Lbg/NPwNvAi6oa+X/vjDFhkZKSwpgxY5g0aRJPPvkkjz/+eGHa5MmTyc3N5Y033qBt27YMGDCANm3akJmZydKlS/nqq69o1KgRs2fP5pxzzilWbp06dfjggw8YMmQIr7/+OnPnzqVfv36cddZZ5Ofns2XLFhYvXszRo0d58cUXg6rrgw8+SG5uLuPHj+f888+ne/fupKenU69ePfbu3cvy5cvZvHkz6enphdt4PB7uuusuHn30Ubp06cKgQYPIzc3lP//5D82aNSu8tlkWLVq0oHfv3ixevJhatWrxk5/8hC5dupTK165dO15++WVuvvlmOnToQP/+/Tn77LPJyckhIyODFStW0KRJEzZs2BBwn6+//jq/+93vaNeuHd26deOMM84gMzOTr7/+miVLlqCqPP300z6Pa/To0axZs4YVK1YQHx9f6vp0Ub/85S85efIkd9xxB/3796dz5850796d5ORkDhw4wOrVq/n888+LDS6LOF9R2V42w5WpPKzl68BPy1dVdc+ePYXTC+7Zs6dU+oIFC3TIkCGakpKicXFxmpSUpOedd56OHTtWDxw44HffeXl5+vbbb+ugQYM0JSVF4+PjtXbt2tq2bVu95ZZbdNWqVSEfz/r16/WOO+7QDh06aGJiono8Hm3atKn2799fp06dWmrGpfz8fH388ce1devW6vF4tEWLFnrfffdpZmZmwBmuAnn11VcLz+/EiRP95v3iiy90+PDhmpqaqnFxcZqcnKwdOnTQ0aNH6+LFi4M69k8++UQfffRR7d27t6alpWlCQoLGx8dr69at9Re/+IWuWLHC7/aZmZlav359BfSGG24Iap8ZGRn6u9/9Trt06aL169fXWrVqaePGjbVXr1767LPP6pEjR4IqJxwtX3HSTXmlp6fr2rVry7z9Z4vfpPOKMWy+Zi5tulwSxpqZ6uCbb76hffv20a6GMYbgP48isk5V072lBep2RkQaAHcCF+D8V7QG+IuqVuCzl4wxxpjqw2/wdQPv/3AGRBWMVb8CGC4iF6pq2YfXGWOMMTVUoFuN7seZYvEb9+cHcEYfn+W+N8YYY0yIAnU7Xwl8D1ygqicAROSvOHMlXwX8PrLVM8YYY6qfQC3fVsDcgsALoKrHcZ7tmxbBehljjDHVVqDgWwfY42X9Xpx5n40xxhgTonLN7WyMMcaY0AW81Qh3bueS68CZ25kfR0EXUtVXwlA3Y4wxploKJvhe475KEopP1ViUBV9jjDHGh0DB9xWciTWMMcYYEyZ+g6+qjqigehhjjDE1ht8BV+7zec+tqMoYY4wxNUGg0c4zgIEVUA9jjDGmxrBbjYwxxpgKZsHXGFNliEixV2xsLA0bNqRXr17MmDGDQI9IXbRoEddddx2pqakkJCSQnJzM+eefz/jx4zl06JDfbfPz85k1axaDBw+mRYsWJCQkULduXdq3b8/o0aNZtWpVyMezYcMG7rzzTjp27Ej9+vWJi4ujWbNmXHHFFUybNo2srKyQy6wqPvzww1K/T4/HQ7Nmzbj22mtZvny51+1mzJhRaruSr6ogmFuNjDGmUhk7diwAOTk5bNmyhXfffZdly5axdu1aXnzxxVL5s7OzufXWW3nttdeoXbs2AwYM4Oyzz+b48eMsWbKEcePG8eKLLzJ79mwuuaT087T37NnDkCFDWLVqFYmJifTr148zzzwTVWXz5s288cYbTJkyheeff54777wzqGN45JFHGD9+PP
n5+XTr1o3hw4dTr1499u7dy4cffsitt97KSy+9RHmeE14VtGzZkhEjRgBw4sQJ1q1bx7vvvst7773HW2+9xc9//nOv23Xq1ImBA6vwVVFV9fkC8oFngNRQXv7KrK6vrl27anl8uugN1bFJuumTZeUqx1RP69evj3YVKgWcWx9LrV+5cqXGxMSoiOjWrVtLpY8cOVIBPe+88zQjI6NYWn5+vr7wwgsaExOj9erVK3WuMzMztVOnTgro9ddfrwcPHixV/pEjR/Thhx/WCRMmBHUcjz32mALaokULXbNmjdc8c+fO1V69egVVXlW0dOlSBbRnz56l0h5//HEFNC0trVTa9OnTFdDhw4dHvpI+BPt5BNaqr/jqK0F/DL55Ib5y/ZVZXV8WfE0kWfB1+Aq+qqrnnHOOAvrOO+8UW79ixQoFNDk5WXft2uWz7Pvvv18B7du3b7H1EyZMUEB79OiheXl5fuuXlZUV8Bi2bdumHo9HPR6Pfvnll0GXVxCsxo4d6zVvy5YttWXLlsXWFQSq6dOn6/z587Vnz56alJSkgO7cuVNjYmK0S5cuPvd/+eWXK1CqnmvWrNHBgwfr6aefrh6PR5s3b66jR4/W77//3v/BF+Ev+O7bt6/wd/3DDz94PaaqHnyDueZ7FMgI4bUjiDKNMSasnO868Hg8xdZPmTIFgFGjRnHGGWf43P7+++8nPj6eRYsWsW3btsL1kydPBuDhhx8mJsb/V2Z8fHzAek6fPp2cnBwGDx5Mx44dy11eMGbNmsWVV15JYmIit912G0OHDiUlJYW+ffvy6aef8uWXX5baZvfu3SxatIiuXbsWq+f06dPp0aMH8+fPp3fv3tx9992kp6czdepU0tPTycjICEudC9SqVT2vjgZzVM+q6iMRr4kxpmzmPwB7Sn95VipNfwID/hyx4pcvX87GjRuJi4vjggsuKJa2cuVKAPr27eu3jOTkZLp27cpHH33EqlWraNWqFTt27CAjI4NatWrRs2fPsNS1oD6XXnppWMoLxrx585g3bx79+/cvtn7EiBEsXLiQmTNnMnHixGJpr732Gnl5eQwfPrxw3aZNmxgzZgxpaWksW7aMlJSUwrQlS5bQr18/7rrrLt59991y1ffvf/87AB07dqRBgwZe83z22WeMGzeu1PqBAwfSuXPncu2/IlTPfymMMdVawZdu0QFXqsrEiRNLtW53794NQIsWLQKWW5Bn165dxbZt1KgRCQkJYal7QZnNmzcPS3nBuOaaa0oFXnACVf369Xn99dd54okniI2NLUybOXMmHo+HG264oXDdSy+9RE5ODpMmTSoWeAH69OnD1Vdfzdy5czl27BiJiYlB1W379u2Fv88TJ06wdu1ali5dSlJSUmEQ9ubzzz/n888/L7U+LS3Ngq8xpgJEsEVZWY0fP77YexFh2rRpjBw50uc2wdyCUtB1XZC35PtwiESZgZTsDShQu3Zthg4dypQpU1iwYAE/+9nPAFi3bh1ff/01gwYNonHjxoX5V69eDcCyZcv4+OOPS5W3b98+8vLy2LRpE127dg2qbt99912p32dycjJLlizxG0SHDx/OjBkzgtpHZWTB1xhT5RQEsMzMTFavXs0tt9zCbbfdRsuWLenTp0+xvE2bNmXbtm1kZGTQtm1bv+Xu3LkToLD13KxZMwD2799PVlZWWFq/zZo1Y8OGDYX7qghNmzb1mTZixAimTJnCzJkzC4PvzJkzAYp1OQMcOHAAgKeeesrv/o4fPx503Xr27MmHH34IwMGDB5k9ezZ33HEHV111FR9//LHfuldlNsmGMabKqlu3Ln379mXu3LmF1ydPnDhRLM9FF10EOBNs+HPo0CHWrVsHQI8ePQCnGzo1NZXc3Fyfkz6EqqA+ixcvDmm7gsFeubm5XtOPHDnic1t/rezu3bvTpk0b5syZw+HDh8nJyeGNN96gcePGhcG4QP369Qv35WsUr6qW+fp4w4YNGTVqFM888ww7d+7k17/+dZ
nKqQr8Bl9VjbHBVsaYyu7cc89l1KhR7Ny5k2effbZY2q233grA1KlT2bt3r88yJk6cSHZ2Nn379qVVq1aF60ePHg3AhAkTyM/P91uP7OzsgHUdOXIkHo+H2bNns379+qDLS05OBmDHjtI3lGzZsoXDhw8H3Lcvw4cPJzs7m7feeov333+f/fv384tf/KLUyPFu3boBsGLFijLvKxi33XYbHTp04N133y3TzGFVgbV8jTHVwh/+8AcSEhKYOHFisakiL7nkEoYNG8bBgwe58sorvXb3/u1vf+OJJ56gXr16TJo0qVjaPffcQ6dOnVixYgU33XST1yB3/PhxHnnkkVIjhr1JS0tj3LhxnDp1iiuuuMLnDFYffPABAwYMKHzfrl07kpKSmDNnDvv27Stcf/LkSX7zm98E3K8/N910EzExMbzyyiu88sorAIWzThV1xx134PF4uOeee9i0aVOp9FOnToUlMMfGxhZeB37wwQfLXV5lZNd8jTHVQkpKCmPGjGHSpEk8+eSTPP7444VpkydPJjc3lzfeeIO2bdsyYMAA2rRpQ2ZmJkuXLuWrr76iUaNGzJ49m3POOadYuXXq1OGDDz5gyJAhvP7668ydO5d+/fpx1llnkZ+fz5YtW1i8eDFHjx71OrWlNw8++CC5ubmMHz+e888/n+7du5Oenl44veTy5cvZvHkz6enphdt4PB7uuusuHn30Ubp06cKgQYPIzc3lP//5D82aNSu8Pl0WLVq0oHfv3ixevJhatWrxk5/8hC5dupTK165dO15++WVuvvlmOnToQP/+/Tn77LPJyckhIyODFStW0KRJEzZs2FDmuhS49tpr6dy5M8uXL2fBggVcfvnl5S6zUvHXb28vm+HKVA42w5UDPzNcqaru2bNH69Spo3Xq1NE9e/aUSl+wYIEOGTJEU1JSNC4uTpOSkvS8887TsWPH6oEDB/zuOy8vT99++20dNGiQpqSkaHx8vNauXVvbtm2rt9xyi65atSrk41m/fr3ecccd2qFDB01MTFSPx6NNmzbV/v3769SpU0vNmJWfn6+PP/64tm7dWj0ej7Zo0ULvu+8+zczMDDjDVSCvvvpq4fmdOHGi37xffPGFDh8+XFNTUzUuLk6Tk5O1Q4cOOnr0aF28eHFQx+5vhqsC//rXvxTQ9PT0UsdU1We4ElX/TwExwUlPT9fyTID+2eI36bxiDJuvmUubLqUndjc12zfffEP79u2jXQ1jDMF/HkVknaqme0uza77GGGNMBbPga4wxxlQwC77GGGNMBbPga4wxxlSwShN8RaS5iLwsIrtEJFtEtovIcyKSHOlyRKS7iMwTkYMickJEvhCRu0Uk1tc2xhhjTFlVivt8ReRM4CPgNGAOsAG4ALgL6C8iPVT1QCTKEZFrgNlAFvAWcBC4CngW6AH8PBzHaIwxxhSoLC3fv+IEzN+o6kBVfUBV++AEwLbAY5EoR0SSgClAHtBLVW9R1fuAzsBqYIiIXB+G4zOm3Oy2QGOiL1yfw6gHXxFpDVwGbAf+UiJ5LJAJDBORuhEoZwjQBHhTVQtv0lXVLOAP7ttfhXA4xkREbGwsOTk50a6GMTVeTk5Osecel1XUgy9Q8PyvhapabNZyVT0GrALqAN0iUE7BNh94KW85cALoLiLxgQ7CmEhKTEzk6NGj0a6GMTXe0aNHSUxMLHc5lSH4Fjxgs/Qs3Y7N7vLsCJTjcxtVzQW24VwXbx1g38ZEVMOGDTl06BD79+/n1KlT1gVtTAVSVU6dOsX+/fs5dOgQDRs2LHeZlWHAVX136ethlAXrG0SgnHLtW0RGA6MBUlNTA1TPv7qNUvik3iWcXr9Rucox1VN8fDypqakcPHiQ7du3k5eXF+0qGVOjxMbGkpiYSGpqKvHx5e8MrQzBN5CCp0CX91/9spTjdxtVnQxMBmdu57JXDdp0vhg6X1yeIkw1Fx8fzxlnnMEZZ5wR7aoYY8qpMnQ7F7Qu6/tITyqRL5zlhGvfxh
hjTNAqQ/Dd6C59XdNt4y59XcstTzk+txGRWkArIBfYGmDfxhhjTNAqQ/Bd6i4vE5Fi9RGRRJyJLk4CayJQzhJ32d9LeZfgjI7+SFWzAx2EMcYYE6yoB19V/RZYCKQBt5dIHg/UBV5R1UwAEfGISDt3Nqsyl+OaBewHrheRwmcuikgCMMF9+1KZD84YY4zxQirDLQtepoX8BrgQ6I3TTdy9YFpIEUnDuQXoO1VNK2s5RbYZiBOEs4A3caaXvBrnNqRZwFAN4iSlp6fr2rVrA2UzxhhTQ4jIOlVN95YW9ZYvFLZa04EZOMHyXuBM4Hngp8HM61zWclT1PaAnzqQag4E7gRzg/4Drgwm8xhhjTCgqRcu3OrCWrzHGmKIqfcvXGGOMqUms5RsmIvID8F05i2mMMwDMlGbnxjc7N/7Z+fHNzo1v4Tg3LVW1ibcEC76ViIis9dVFUdPZufHNzo1/dn58s3PjW6TPjXU7G2OMMRXMgq8xxhhTwSz4Vi6To12BSszOjW92bvyz8+ObnRvfInpu7JqvMcYYU8Gs5WuMMcZUMAu+xhhjTAWz4GuMMcZUMAu+ESQizUXkZRHZJSLZIrJdRJ4TkeRolFOZlPeYRKSRiNwqIu+KyBYROSkiR0RkpYjcUvKxklVJJH7fIjJMRNR93RrO+lakcJ4bEblYRGaLyG63rN0islBEfhaJukdaGL9vrnDPw073c7VVRN4RkZ9Gqu6RJCJDROQFEVkhIkfdz8BrZSwrfJ9NVbVXBF44D3TYCyjwHvBnnOcHK7ABaFSR5VSmVziOCbjNzb8LeB14HHgZOOyun4U7oLAqvSLx+wZauOflmFvOrdE+zmifG+AP7nY/ANOBP+GMbv0YeDLaxxqtcwM84W6zH5jqljMLOAXkAzdG+1jLcG4+c4/pGM6T7hR4LZp/f6pqwTeCv/AF7i/lzhLrn3HX/60iy6lMr3AcE9AHuAqIKbG+KZDhljM42scarb+bItsJsAj4FniqigffcH2mfu7m/w+Q6CXdE+1jjca5cT87ecAe4LQSab3dcrZG+1jLcG56A23cz0KvcgTf8H42o31iquMLaO3+MrZ5CQ6JwHEgE6hbEeVUpldFHBPwoLuPF6J9vNE+N8BdOC2WS4BxVTX4hvEzFQNsdfM2ifZxVbJzc6Fbzhwf6UeBY9E+3nKeqzIF30h8NqvsdbFKro+7XKiq+UUTVPUYsAqoA3SroHIqk4o4phx3mVuOMqIhrOdGRNrjdI1NUtXl4axoFITr3HQHWgHzgEPu9c37ReSuqnpNk/Cdm8043csXiEjjogkicglOkFkUlhpXPWH/3rLgGxlt3eUmH+mb3eXZFVROZRLRYxKRWsBN7tsPylJGFIXt3Ljn4VWcLvgHy1+1qAvXuTnfXe4FPgH+jfMPynPARyKyTES8PoWmEgvLuVHVg8D9wOnAehGZLCKPi8jbwEKcbvoxYahvVRT2761a5aqO8aW+uzziI71gfYMKKqcyifQx/RnoCMxT1QVlLCNawnlu/gh0AS5S1ZPlrVglEK5zc5q7vA2nC7Ev8F+gJfA0cDnwDk73ZFURtr8bVX1ORLbjDF4cVSRpCzBDVfeVtZJVXNi/t6zlGx3iLss7t2e4yqlMynxMIvIb4F6ckYfDwlmpSiKocyMiF+C0dp9W1dURr1XlEOzfTWyR/ENUdbGqHlfVr4FBwE6gZxXugvYm6M+UiPwOZ3TzDJzRvXWBrjjXyV8XkScjVMeqLuTvLQu+kVHwX1B9H+lJJfJFupzKJCLHJCK3A5OA9UBvtwutqin3uSnS3bwJeDh8VYu6cP3dHHKXW1X186IJbg9BQW/JBSHXMHrCcm5EpBfOrUb/UtX/U9WtqnpCVT/B+cfke+BeEWkdhjpXNWH/3rLgGxkb3aWv/v827tLX9YNwl1OZhP2YRORu4EXgK5zAu6fs1YuqcJybeu727YGsIhNrKDDWzTPFXfdcuWtcccL9mTrsI70gONcOsl6VQb
jOzZXucmnJBFU9AfwPJ2Z0CbWC1UDYv7fsmm9kFPzxXiYiMUVHx4lIItADOAmsqaByKpOwHpOI3I9znfczpquemgAAC1xJREFUoJ+q7g9zfStSOM5NNjDNR9p5OF+cK3G+TKpSl3S4/m6W44yCbyMicap6qkR6R3e5vfxVrjDhOjfx7tLXgLOC9SXPWU0Q/u/iaN93VV1fhHBDNuAB2gFnlqecqvIK47l52M2/FmgY7eOqTOfGR9njqKL3+Yb57+Y1N/+EEuv74dwTfRhoEO3jrehzAwx18+4BUkqkDXDPzUmq4Kx6RY6jF37u863I72J7nm+EiMiZwEc4oyvn4ExrdiHObCubgO6qesDNm4Yz8vI7VU0razlVRTjOjYgMxxkUkge8gPdrLdtVdUZkjiIywvV346PscThdz6NUdWr4ax9ZYfxMnYZzX+ZZwAqc7tSWONc1FfiFqr4T8QMKozB9pmJwAkxfnKkY38UJxO1xuqQFuFtVJ1XEMYWLiAwEBrpvm+KMaN+K87sH2K+qv3XzplFR38XR/k+kOr9w5tSdDuzG6ar5DmdQUMMS+dJwPvTby1NOVXqV99zwYyvO3+vDaB9nNP9uvJRbcM6qZMs3nOcGaIjTYtnmlnPA/ULtFu1jjOa5wWn53Y3TfXoUp4t+H8790JdF+xjLeF4CfVdsL5K3wr6LreVrjDHGVDAb7WyMMcZUMAu+xhhjTAWz4GuMMcZUMAu+xhhjTAWz4GuMMcZUMAu+xhhjTAWz4GuMMcZUMAu+xlQCIhInIptF5P1o16U6EpHt7nNqS65PEpHn3fRc94ETnUWkl/vzuHLs80P3gRYRIyIviMghEWkcyf2Y8LPga6qdok/y8fEaUSTvOC/pJ0Vkk4j8RUSalyjbW/4sEdkiIpPd6enK4jc40x3+scT+GojIfSLyuoisLxIg+pZxP//f3rkHW11VcfzzjXJ8JCDXkoFEaaaGojFTVGCavOaEQUE6YZqaFqIYGQ5JaGYC04CANvYwMSnFIWmmBzHh6GhDc23EfECaQFiAIoxjooGAYUK4+mPt3/Dj9Dsv7rnnHmh9Zs7sc/dv7cc598xv/fZaa699QEg6Lc3hRUlvSdohaYOkpZKmSjqqmfNpIHOBrwOrgJuBGXhKxS6hEUq9hJn4gQiN6i9oEnGqUXAoM6NM/TMFdY8AHen9scAIYCLwBUlDzWxDBfk24JPAFcBYSWeY2bpaJ5kU17eB35vZypLLJ+IKAvyg99eA42rtuxFIugS4F8/t+wc85+9eYCAwBM/7uxhY38x51cnZZeo/C/zdzEbnKyXtwHMad+aUrEuBIzvRvipm9g9JC4AJkuaa2aauHC9oHKF8g0MWM5teh3hHXl7Su4AH8Zv2jcBXqsi/A1gKjAJuKJCvxEVAb/ygiFJexBPdP21mW9ON9rI6+u4Uko4Efoznux1hZssKZIbTOSXV5RQ8PGX0w48ZLJXfBTzXyTGbpQjvBb4KXIn/VoODgDA7B0EBZrYHuCv9eXoN8m+zT3meVudwl+NJ2pcU9LvNzJaZ2dY6+2wUHwF6AquLFC+AmT1mZvsdTp9Mqx2S+klaKGlLMuevlHRRucEknSPpAUmvJfP2Bkm3SOpdRv59yWe7Lpn/t0p6UtJ3SuT28/nm/LECzsy5EDrS9bLmYUl9JM2UtFrSLknbJf1F0uy8+b3U55senLJzYaeVuC7aJV2V3t9EAZL6StojaVW+3syewM8fHidJ5b7boLWIlW8QlCe7kdUaNJPJ76l5AKkXbrp9Kq22Wo3siLR+ko4ys3/V0fYY/Ai21/GTYHrjZ8beJ6m/md2SF05KZwawFT9FZwtwEjAFGCVpmJntyMkPwY/A64OvXhfjZt4P4z7Q71aY2wLcbTANty4sSPUbK30gSQNxBXoCsBKYhy9iPghMBu4Eyn1H2cPVZezvtsjGXQHMAcZLmmlme0vaj8Pv2T8p6Hs5cDEwGFhd6TMErUEo3+CQpUxQy0ar4Y
xfSe/EzXgAT9Qg3wO/OQI8WuMUAYYBPfAbbyvyPPAUvppfLmk+rlDXmNnuKm1PAn4FXJgsA0iajSutmZJ+Y2bPp/qzcMX7J2BUfiWdAuTuSdcnp7rDUt99gIvNbFF+YEnHV5pY9huQNA3/TUyv8lkyfo4r3hvM7OaSMY8F3qgw5hJJr+PKt6NoTEkLga/hh9ffn6sXMB7YBSws6P4pXPl+glC+BwWhfINDmWkFdY9Q7FttzynrNvzA7Q/gvsyZVeT7AJ8CBgF/pfKKq5QBqXy5jjZNw8xM0ljcr9gO3J4u7ZH0NL7anJdfkebYC1yXKd7U3wuSfoj/b77EvqC4Sam8otSEbWYLJF2DK5fJqXo0Hoz2u1LFm9psrvezVkPSqcBwPGBvTsGYjfB7z8OV7wRyyhcPABwI3GNm2wvaZRHaAwquBS1IKN/gkMXM6vF/nZle4P7XzbgJcVaZG3lePuMZoL3MzbEcbancVkebuklboL5cWl/Lii8FDp0l6UP4Q8YQ3A+evSZKajezF0qabiqog33m3o/l6obh5vrzJZ1f0OYw4D2S2szsn8DQVP9gtfk3kGzMh/IPFI3EzNZI+iMwUtLxud9eZoW5s0zTLCYg9vseJITyDQJnRp3R0TPMbHqKcu6P+yUnAb+UNLKOm/ObqTy8jrEPhBMptgRMr7UDM1sLrM3+ljQIuBtXnLcB55Y0eaVMV9kqrVeurg2/HxXNMc+7cT90FoD1UtWJN45mjXkHbj4ejwdm9QXGAM+Y2ZNl2hyRyjfLXA9ajIh2DoJOYGZvm9lmM7sG+DVuHry6ji62pLKtolQnMbMOM1Ppq5N9PoebjsH3OZdSbj9y31TmLQTbgW1Fcyx5vZjkM9N0/858hjpp1piL8QeXy3OxBOUCrTKy38+WCjJBCxHKNwgax7XAW8BNknrW2ObZVA7qmil1OTtTWaTIB6g441d7Kp/O1T0OHCNpcI3jPp7KkTXKN4JszHOSxeNAyCKYe5QTSNvcfoor+dH4CvgN4L4K/Wa/n6IEMkELEso3CBpE8o3Ox1ch19bYbA3wKvv8iS2FpIGSJqUtUaXXhGfmgoJEFbiCmZNXVGmrziTgP3jkcMZtqZwvqV/BWEdJyn9HS/HtOWMkfbFAvuGr05R97DHgZOC6gjHbJFVzH2Rbt6oFRt2FK+rb8UCrRWa2s4L80CRf9H8IWpDw+QZBY5mFJ82YLOlH1SJgUzTxb4ErJQ02szWlMpJuZV8gzcdT+U152keAJWb2Pwk6GkQv4AfALZKW49tYdgLvxU3N78dNnUUPG88CZwArJT2c+roA951OzWedMrNlkq7H8yuvk/QA8ALu4z0BD257FPh0kt+dArMeBhZJmoCvTA/H00KeTdfc3y7BA8ZmSfp8ei88Mn4EvgLdWKH933Cf8YWSdgOb8H3kC3Mmdcxsk/yQjTGpqqzJOT0YnQ4sqzPYL+hGQvkGQQMxs5clzQO+AXyL2lbAd+DRrJdSsKICxuIKKM+I3PuNFGTHahBrgfPSeENx5dkH32+6Hn/Y+L6ZvVrQdhtuFp6Lp9vsiW/FurXM9qA5ScFPwh8yPof7gl/CV4KLSuRXSDoZuD6NMxx/MFhP9cCtAyJtlToFmIoHmF0N/Bv/H3yPKj5XM9sr6TxgNp5w5GhceT+KJ/vIczeufFeY2Z8rdHsB/tAxr97PE3QfMuvSE6+CIKgBSQ8BHwUGmtlBH7Ga0io+Ymbt3T2Xg5W0j3waMN7MflZBbgVuIRhckBUraFHC5xsErcEU3LQ8sbsnEnQ/ko4GrsL37/6igty5wKnAlFC8Bxdhdg6CFsDMVkkah5shg/9TJH0GOAWPcj4OV6qVcn4fAUw2s/sryAQtSJidgyBoOGF2PjC078jIV3Cf741dlU0r6F5C+QZBEARBkwmfbxAEQRA0mVC+QRAEQdBkQvkGQRAEQZMJ5RsEQRAETSaUbxAEQRA0mf8CrpYYK0lqIcYAAAAASUVORK5CYII=\n
",
            "text/plain": [
              "<Figure size 504x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "J5b7-y6UxXF0",
        "outputId": "64060500-a4a0-4883-c9e6-5cf5ce77245d"
      },
      "source": [
        "from sklearn.metrics import roc_auc_score\n",
        "np.set_printoptions(precision=3, suppress=True)\n",
        "rf_auc = roc_auc_score(y_test, rf.predict_proba(X_test)[:,1])\n",
        "svc_auc = roc_auc_score(y_test, svc.decision_function(X_test))\n",
        "print(\"AUC for random forest: {}\".format(rf_auc))\n",
        "print(\"AUC for SVC: {}\".format(svc_auc))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "AUC for random forest: 0.9845911949685534\n",
            "AUC for SVC: 0.8830188679245282\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r7AcmTdcxXF1"
      },
      "source": [
        "Further reading on the relationship between ROC and precision-recall curves: https://www.biostat.wisc.edu/~page/rocpr.pdf"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ifrA2mulxXF1"
      },
      "source": [
        "# Multi-Class Confusion Matrix Implementation"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "DbWh5OpcxXF1",
        "outputId": "0e63bcf8-c154-4537-809b-5c857c7139f7"
      },
      "source": [
        "from sklearn.datasets import load_digits\n",
        "from sklearn.metrics import accuracy_score\n",
        "\n",
        "digits = load_digits()\n",
        "# data is between 0 and 16\n",
        "X_train, X_test, y_train, y_test = train_test_split(digits.data / 16., digits.target, random_state=0)\n",
        "\n",
        "lr = LogisticRegression().fit(X_train, y_train)\n",
        "pred = lr.predict(X_test)\n",
        "print(\"Accuracy: {:.3f}\".format(accuracy_score(y_test, pred)))\n",
        "print(\"Confusion matrix:\")\n",
        "print(confusion_matrix(y_test, pred))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Accuracy: 0.962\n",
            "Confusion matrix:\n",
            "[[37  0  0  0  0  0  0  0  0  0]\n",
            " [ 0 40  0  0  0  0  1  0  1  1]\n",
            " [ 0  0 44  0  0  0  0  0  0  0]\n",
            " [ 0  0  0 43  0  0  0  0  1  1]\n",
            " [ 0  0  0  0 37  0  0  1  0  0]\n",
            " [ 0  0  0  0  0 46  0  0  0  2]\n",
            " [ 0  1  0  0  0  0 51  0  0  0]\n",
            " [ 0  0  0  0  2  0  0 46  0  0]\n",
            " [ 0  3  1  0  0  1  0  0 43  0]\n",
            " [ 0  0  0  0  0  1  0  0  0 46]]\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "C:\\Users\\abpe\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\_logistic.py:762: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
            "STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
            "\n",
            "Increase the number of iterations (max_iter) or scale the data as shown in:\n",
            "    https://scikit-learn.org/stable/modules/preprocessing.html\n",
            "Please also refer to the documentation for alternative solver options:\n",
            "    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
            "  n_iter_i = _check_optimize_result(\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9bRnAb7sxXF2",
        "outputId": "62c3ae85-09b7-471f-95d6-c94aab526294"
      },
      "source": [
        "print(classification_report(y_test, pred))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "              precision    recall  f1-score   support\n",
            "\n",
            "           0       1.00      1.00      1.00        37\n",
            "           1       0.91      0.93      0.92        43\n",
            "           2       0.98      1.00      0.99        44\n",
            "           3       1.00      0.96      0.98        45\n",
            "           4       0.95      0.97      0.96        38\n",
            "           5       0.96      0.96      0.96        48\n",
            "           6       0.98      0.98      0.98        52\n",
            "           7       0.98      0.96      0.97        48\n",
            "           8       0.96      0.90      0.92        48\n",
            "           9       0.92      0.98      0.95        47\n",
            "\n",
            "    accuracy                           0.96       450\n",
            "   macro avg       0.96      0.96      0.96       450\n",
            "weighted avg       0.96      0.96      0.96       450\n",
            "\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8CY1lBmXxXF2"
      },
      "source": [
        "# Metrics for Regression Models"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "s9cZdp1RxXF2"
      },
      "source": [
        "Built-in standard metrics:\n",
        "\n",
        "* $R^2$: easy-to-understand scale (1 is a perfect fit)\n",
        "* MSE (mean squared error): easy to relate to the target, since it is expressed in squared target units\n",
        "* Mean absolute error, median absolute error: more robust to outliers\n",
        "\n",
        "**NOTE**: More details on these metrics and how they can be implemented in practice are given in the tutorials."
      ]
    },
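    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A minimal sketch of the metrics listed above, on made-up numbers. The arrays `y_true` and `y_pred` are illustrative assumptions, not data from this notebook; the last prediction is deliberately far off to show how a single large error affects MSE more than the mean or median absolute error."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import numpy as np\n",
        "from sklearn.metrics import (r2_score, mean_squared_error,\n",
        "                             mean_absolute_error, median_absolute_error)\n",
        "\n",
        "# Hypothetical targets and predictions; the last prediction is far off.\n",
        "y_true = np.array([3.0, 5.0, 7.0, 9.0, 11.0])\n",
        "y_pred = np.array([2.5, 5.5, 7.0, 8.5, 16.0])\n",
        "\n",
        "r2 = r2_score(y_true, y_pred)\n",
        "mse = mean_squared_error(y_true, y_pred)\n",
        "mae = mean_absolute_error(y_true, y_pred)\n",
        "medae = median_absolute_error(y_true, y_pred)\n",
        "\n",
        "print(\"R^2:                   {:.3f}\".format(r2))\n",
        "print(\"MSE:                   {:.3f}\".format(mse))\n",
        "print(\"Mean absolute error:   {:.3f}\".format(mae))\n",
        "print(\"Median absolute error: {:.3f}\".format(medae))"
      ],
      "execution_count": null,
      "outputs": []
    },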
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wt_9xD8QxXF3"
      },
      "source": [
        "* Mean absolute percentage error (MAPE = $\\frac{100}{n}\\sum_{i=1}^n\\left|\\frac{y_i-\\hat{y}_i}{y_i}\\right|$): a relative error measure, in contrast to the absolute error measures"
      ]
    },
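    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The MAPE formula can be sketched directly in NumPy. The helper `mape` below is a hypothetical name introduced only for illustration, and it assumes that no true value is zero. The two made-up examples have the same absolute error (10) but very different relative errors."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import numpy as np\n",
        "\n",
        "def mape(y_true, y_pred):\n",
        "    # Mean absolute percentage error, in percent.\n",
        "    # Illustrative helper; assumes no entry of y_true is zero.\n",
        "    y_true = np.asarray(y_true, dtype=float)\n",
        "    y_pred = np.asarray(y_pred, dtype=float)\n",
        "    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))\n",
        "\n",
        "# Same absolute error of 10 on two hypothetical targets of different scale.\n",
        "mape_small_target = mape([20.0], [30.0])\n",
        "mape_large_target = mape([1000.0], [1010.0])\n",
        "\n",
        "print(\"MAPE, target 20:   {:.1f}%\".format(mape_small_target))\n",
        "print(\"MAPE, target 1000: {:.1f}%\".format(mape_large_target))"
      ],
      "execution_count": null,
      "outputs": []
    },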
    {
      "cell_type": "code",
      "metadata": {
        "id": "wOk1gFP7xXF3",
        "outputId": "fbd45ff5-b655-4006-a487-2bbe6d23d904"
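      },
      "source": [
        "# Import SCORERS from sklearn.metrics; the sklearn.metrics.scorer module is deprecated.\n",
        "# On scikit-learn >= 1.0, sklearn.metrics.get_scorer_names() lists the same names.\n",
        "from sklearn.metrics import SCORERS\n",
        "print(\"\\n\".join(sorted(SCORERS.keys())))"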
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "accuracy\n",
            "adjusted_mutual_info_score\n",
            "adjusted_rand_score\n",
            "average_precision\n",
            "balanced_accuracy\n",
            "completeness_score\n",
            "explained_variance\n",
            "f1\n",
            "f1_macro\n",
            "f1_micro\n",
            "f1_samples\n",
            "f1_weighted\n",
            "fowlkes_mallows_score\n",
            "homogeneity_score\n",
            "jaccard\n",
            "jaccard_macro\n",
            "jaccard_micro\n",
            "jaccard_samples\n",
            "jaccard_weighted\n",
            "max_error\n",
            "mutual_info_score\n",
            "neg_brier_score\n",
            "neg_log_loss\n",
            "neg_mean_absolute_error\n",
            "neg_mean_gamma_deviance\n",
            "neg_mean_poisson_deviance\n",
            "neg_mean_squared_error\n",
            "neg_mean_squared_log_error\n",
            "neg_median_absolute_error\n",
            "neg_root_mean_squared_error\n",
            "normalized_mutual_info_score\n",
            "precision\n",
            "precision_macro\n",
            "precision_micro\n",
            "precision_samples\n",
            "precision_weighted\n",
            "r2\n",
            "recall\n",
            "recall_macro\n",
            "recall_micro\n",
            "recall_samples\n",
            "recall_weighted\n",
            "roc_auc\n",
            "roc_auc_ovo\n",
            "roc_auc_ovo_weighted\n",
            "roc_auc_ovr\n",
            "roc_auc_ovr_weighted\n",
            "v_measure_score\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "C:\\Users\\abpe\\Anaconda3\\lib\\site-packages\\sklearn\\utils\\deprecation.py:143: FutureWarning: The sklearn.metrics.scorer module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.metrics. Anything that cannot be imported from sklearn.metrics is now part of the private API.\n",
            "  warnings.warn(message, FutureWarning)\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2JztFf9ixXF3"
      },
      "source": [
        "# Imbalanced Data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OdprAt96xXF4"
      },
      "source": [
        "* Almost all real-world data is imbalanced to some degree\n",
        "* Often the goal is to detect rare events, i.e. the minority class"
      ]
    },
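    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A small sketch of why rare events need care when evaluating a model. The labels below are synthetic (95 negatives, 5 positives) and the names `X_imb`, `y_imb` and `baseline` are assumptions made for this example. A baseline that always predicts the majority class reaches high accuracy while never detecting the rare class, which balanced accuracy and recall make visible."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import numpy as np\n",
        "from sklearn.dummy import DummyClassifier\n",
        "from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score\n",
        "\n",
        "# Synthetic labels: 95 negatives and 5 positives (the rare event).\n",
        "y_imb = np.array([0] * 95 + [1] * 5)\n",
        "X_imb = np.zeros((100, 1))  # features are irrelevant for this baseline\n",
        "\n",
        "# Baseline that always predicts the most frequent class.\n",
        "baseline = DummyClassifier(strategy=\"most_frequent\").fit(X_imb, y_imb)\n",
        "pred_imb = baseline.predict(X_imb)\n",
        "\n",
        "acc = accuracy_score(y_imb, pred_imb)\n",
        "bal_acc = balanced_accuracy_score(y_imb, pred_imb)\n",
        "rec = recall_score(y_imb, pred_imb)\n",
        "\n",
        "print(\"Accuracy:          {:.2f}\".format(acc))\n",
        "print(\"Balanced accuracy: {:.2f}\".format(bal_acc))\n",
        "print(\"Recall (rare):     {:.2f}\".format(rec))"
      ],
      "execution_count": null,
      "outputs": []
    },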
    {
      "cell_type": "code",
      "metadata": {
        "id": "a1SVc5SZxXF4"
      },
      "source": [
        ""
      ],
      "execution_count": null,
      "outputs": []
    }
  ]
}